INTRODUCTION



1. DETECTION OF CHANGE 

As sensory stimuli are experienced, adaptive neural mechanisms 
extract information from these events. However, the processes 
underlying this capability are not yet well understood and continue to 
inspire research efforts. Measures of brain activity when a change in 
stimulation occurs can be assessed with electroencephalography (EEG), 
event-related brain potential (ERP), and functional magnetic resonance 
imaging (fMRI) techniques. The chapters in this book are snapshots of 
the recent progress made with these methods. 

The central theme is the detection of change when stimulus 
parameters are well controlled. The main questions are: Where and how 
does neural change detection occur? Are similar processes elicited across 
modalities? How do these events contribute to cognition? Leading 
experts have reviewed these issues, with background material integrated 
into each chapter. Topics include analysis of mismatch negativity, 
P3a/P3b theory and sources, human lesion studies, how EEG reflects 
cognition, and stimulus binding. These areas serve as the backdrop for 
discussions of stimulus modality ERP effects, the conjoint use of fMRI 
methods, and neuroelectric models of attention, perception, and memory. 



2. ORGANIZATION AND CONTENTS 

The text covers the gamut of experimental studies using stimulus 
change paradigms, with clinical data augmenting the utility of the 
methods. The book’s chapters are organized around the major topics of 
MMN, P300, and EEG oscillations to show the range of ways in which
modern neuroimaging methods can measure stimulus change processing.
The authors constructed the chapters as they deemed appropriate but 
were encouraged to write for a broad audience by reviewing results in 
their theoretical context. This goal was very well met, so that the 
contents are fresh and the literature distillations helpful and informative. 

The first section on MMN provides a very accessible précis of this
huge ERP research area. Kujala and Näätänen lead off with a synopsis of
the field that sets the stage for the subsequent chapters. The historical 
developments of the MMN, its theory, and clinical applications are 
clearly limned. Alho, Escera, and Schröger then examine the
relationships among MMN, P3a, and reorienting negativity produced by 
auditory/visual stimulus interactions. The measurement precision and 
integrative results are illuminative and important. Heslenfeld describes 
the background for and the data from a series of innovative studies that
appear to elicit the elusive visual MMN. The results are exciting and the 
implications for future MMN work intriguing. Winkler thoughtfully 
tackles the fundamental assumptions underlying MMN by going 
“beyond the oddball paradigm.” How a deviant stimulus can be defined 
is discussed with compelling scholarly force in this provocative chapter. 

The second section extends these topics by dissecting the P300 into 
its constituent P3a and P3b subcomponents. Polich presents an overview 
of P300 theory and outlines how the P3a and P3b may interact. Stimulus 
novelty per se is not required for P3a generation under appropriate task 
conditions, so the psychological origins of this potential and the P3b can 
be reasonably inferred. Hartikainen and Knight cogently review ERP 
data from neurologically lesioned patients. The findings delineate how 
different brain structures contribute to P3a and P3b and are of keen 
theoretical interest. Opitz reports on P300 studies that integrate ERP and 
fMRI methods in normal subjects. The data from both approaches are 
constrained by current source density analysis to help isolate P3a and 
P3b neural loci in a technically rigorous fashion.

The third section focuses on EEG oscillations. Gevins, Smith, and 
McEvoy succinctly summarize EEG methods by highlighting how 
advanced techniques can magnify the sensitivity of this brain measure. 
The findings forcefully demonstrate that increased resolution and 
sophisticated analysis clearly abet cognitive neuroscience. Klimesch 
provides an informative review of the relationship between EEG and 
memory processes. Event-related desynchronization (ERD) data from
sophisticated designs appear to reflect the genesis of memory formation. 
Hermann describes the technical basis of gamma activity and how it may 
underlie stimulus binding. The illustrative studies strongly support the 
excitement of the “gamma bandits” that the origins of perceptual 
consciousness can be measured. 



3. FINAL COMMENTS 

As this summary suggests, the book’s chapters encapsulate the 
recent findings on how electric and magnetic measures reflect detection 
of stimulus change. Putting this project together has been immensely 
stimulating and rewarding. I sincerely thank all of the
authors for their contributions and patient collegial support. The superb 
technical skills of Angela Caires and Nancy Callahan are gratefully 
acknowledged. I also thank Floyd Bloom for helping to make all this 
possible. 



John Polich 
La Jolla, California 
September, 2002 
AUDITORY ENVIRONMENT AND CHANGE
DETECTION AS INDEXED BY THE MISMATCH 
NEGATIVITY (MMN) 

ANU KUJALA AND RISTO NÄÄTÄNEN

Cognitive Brain Research Unit, Department of Psychology, University of Helsinki, Finland 



Mismatch negativity (MMN) is an automatic event-related brain potential 
(ERP) that reflects a change in auditory stimulation and provides a unique 
measure of central sound representation. The electrically registered MMN 
and its magnetic equivalent MMNm are elicited by a discriminable change 
in any repetitive aspect of auditory stimuli even in the absence of attention 
(Näätänen et al., 1978; Näätänen, 1992). Moreover, the brain mechanisms
generating the MMN response initiate an attention switch to sound change, 
and thus cause its conscious perception. Hence, MMN can be used to probe 
the emergence and accuracy of the cortical representations for present and 
past sound events. Furthermore, the MMN can be used as a means to assess 
deficits in central auditory processing for various clinical conditions. 

The auditory environment is almost continuously changing. A change 
can take place within the stream of sounds reaching the auditory system 
(e.g., in a speech stream or just in the background noise originating from the 
street). A change can also be an appearance of a new or unexpected sound 
(e.g., a new warning signal or a cough in the middle of a choral melody), a
modulation in an ongoing familiar sound (e.g., a rise in speech voice), or 
even an omission of a repetitive sound in a sound stream. The common 
factor for all these situations is that before the change, a somewhat stable 
auditory environment existed. For a change to be detected, the context to 
which it is compared must be represented, even though this context usually 
is also changing, at least with respect to some of its features. How, then, 
does the brain form the auditory context and how is a change in it detected 
and distinguished as a potentially relevant event? 
1. MMN: THE BRAIN’S AUTOMATIC RESPONSE
TO CHANGES IN AUDITORY STIMULATION 

The MMN and its magnetic equivalent MMNm are elicited by any 
discriminable change in some repetitive aspect of auditory stimulation. In a 
traditional MMN paradigm, infrequent (deviant) stimuli occasionally 
replacing the repeating (standard) stimuli elicit an MMN, which peaks at 
100-200 milliseconds from change onset. This stimulus change produces a 
negative deflection in the ERP to deviant stimuli relative to the standard- 
stimulus waveform. The MMN is usually separated from other ERP 
components that the standard and deviant stimuli elicit by subtracting the 
standard ERP waveform from that to the deviant stimuli. Thus, the 
remaining difference waveform is related only to the stimulus change, 
although some contribution of changes in the exogenous components can in 
certain cases occur. 
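
The subtraction described above can be sketched in a few lines of code. This is a minimal illustration only, not the analysis pipeline of any study cited here: the function names, the toy amplitude values (in microvolts), the 50-millisecond sampling step, and the 100-200 millisecond search window are all assumptions chosen for the example.

```python
from statistics import fmean


def average_erp(epochs):
    """Average equal-length single-trial epochs sample by sample."""
    return [fmean(samples) for samples in zip(*epochs)]


def difference_wave(deviant_epochs, standard_epochs):
    """Deviant-minus-standard difference waveform."""
    deviant = average_erp(deviant_epochs)
    standard = average_erp(standard_epochs)
    return [d - s for d, s in zip(deviant, standard)]


def peak_negativity(wave, times_ms, window=(100, 200)):
    """Return (amplitude, latency) of the most negative sample in the window."""
    in_window = [(v, t) for v, t in zip(wave, times_ms)
                 if window[0] <= t <= window[1]]
    return min(in_window)


# Hypothetical toy data: 8 samples per epoch, one sample every 50 ms.
times_ms = list(range(0, 400, 50))
standard_epochs = [
    [0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0],
]
deviant_epochs = [
    [0.0, 1.0, -0.5, -2.0, -0.5, 0.0, 0.0, 0.0],
    [0.0, 1.0, -1.5, -3.0, -0.5, 0.0, 0.0, 0.0],
]

mmn = difference_wave(deviant_epochs, standard_epochs)
amplitude, latency = peak_negativity(mmn, times_ms)
print(mmn)                 # [0.0, 0.0, -1.5, -2.5, -0.5, 0.0, 0.0, 0.0]
print(amplitude, latency)  # -2.5 150
```

In the toy data the standard and deviant epochs share the same early positive deflection at 50 milliseconds, so it cancels in the subtraction and only the change-related negativity remains.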

Figure 1 illustrates how MMN amplitude and latency depend on the 
magnitude of the stimulus change. MMN amplitude is enhanced and latency 
shortened as the difference between the deviant and standard stimuli is 
increased. Furthermore, MMN is elicited even in the absence of attention, as 
when the subject is reading, watching a silent video, or performing a visual 
task, or even when a patient is in a coma state (Kane et al., 1993). Attention 
can, however, have some influence on the MMN amplitude, but its 
withdrawal does not abolish the MMN (Trejo et al., 1995; Woldorff et al., 
1991). 

It is important to emphasize that stimulus change causes the MMN, with 
the infrequent sounds alone eliciting no MMN (Korzyukov et al., 1999; 
Kraus et al., 1993; Näätänen et al., 1989a). Accordingly, MMN can be
produced by stimuli occasionally occurring too early in a stimulus sequence
(Ford & Hillyard, 1981; Hari et al., 1989; Näätänen et al., 1993a; Nordby et
al., 1988) or when omitted from a rapidly presented stimulus train (Yabe et 
al., 1997). The MMN is therefore a response that reflects change detection in 
the automatic comparison of the present stimulus with the sensory-memory 
representation of the previous stimuli. Hence, MMN is not produced by 
deviant stimuli activating new afferent sensory elements unrelated to those 
activated by the standard stimuli. Furthermore, electrical (Giard et al., 1995), 
magnetic (for a review, see, e.g., Alho, 1995), and intracranial (Kropotov et 
al., 1995, 2000) recordings have shown that this change-detection process 
originates in the auditory cortices, with some evidence obtained for an 
additional right-hemispheric frontal MMN generator (Alho et al., 1994; 
Giard et al., 1990; Rinne et al., 2000). 
Figure 1. (a) MMN to frequency deviation: grand-average difference waves obtained by
subtracting ERPs to 1000 Hz standard tones from those to deviant tones with higher 
frequencies (see legend). Each deviant stimulus occurred among standard tones in separate 
stimulus blocks at a probability of 0.05 (data from 10 subjects). MMN amplitude increases 
and latency decreases with increasing frequency deviation. (b) MMN peak amplitude
increases with increases in the magnitude of frequency deviation. (c) MMN peak latency
decreases as the magnitude of frequency deviation increases (after Tiitinen et al., 1994). 
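
A stimulus block of the kind described in the Figure 1 caption, with deviants occurring among standards at a probability of 0.05, could be generated along the following lines. This is a hedged sketch: the caption does not say how the original sequences were randomized, so the fixed deviant count, the simple shuffle, the seed, and the function name are assumptions made for illustration.

```python
import random


def oddball_sequence(n_trials, p_deviant=0.05, seed=0):
    """Return a shuffled list of 'standard' and 'deviant' trial labels.

    The number of deviants is fixed at round(n_trials * p_deviant) so that
    the block has exactly the intended proportion (an assumption; a real
    experiment may also forbid consecutive deviants).
    """
    n_deviant = round(n_trials * p_deviant)
    sequence = ["deviant"] * n_deviant + ["standard"] * (n_trials - n_deviant)
    random.Random(seed).shuffle(sequence)  # seeded for reproducibility
    return sequence


block = oddball_sequence(1000, p_deviant=0.05, seed=1)
print(block.count("deviant"), block.count("standard"))  # 50 950
```

Fixing the deviant count, rather than drawing each trial independently, keeps the proportion of deviants identical across blocks, which makes the averaged waveforms easier to compare.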
2. CENTRAL SOUND REPRESENTATIONS AS
INDEXED BY MMN 

The MMN can be elicited by a change in sound frequency (Sams et al., 
1985; Tiitinen et al., 1994), intensity (Lounasmaa et al., 1989; Näätänen et
al., 1987a), or spatial locus of origin (Paavilainen et al., 1989; Schröger &
Wolff, 1996). The MMN is also elicited by changes in the frequency 
components of complex sounds such as phonemes (Aaltonen et al., 1987; 
1994; Aulanko et al., 1993; Kraus et al., 1995; Näätänen et al., 1997) or
other spectrally complex sounds (Alho et al., 1996; Winkler et al., 1998), 
such as chords (Tervaniemi et al., 1999). MMN is also elicited by changes in 
the temporal features of sound stimulation, such as duration (Kaukoranta et 
al., 1989; Näätänen et al., 1989b), rise time (Lyytinen et al., 1992), the
temporal structure of sound patterns (Alho et al., 1993; 1996; Näätänen et
al., 1993b; Schröger, 1994; Tervaniemi et al., 1997; Winkler & Schröger,
1995), or shortening of the time interval between successive stimuli. Even 
violations of abstract relationships between the elements of auditory 
stimulation can evoke the change-detection process, as when an infrequent 
tone repetition occurs within a sequence of tones with a continuously 
descending pitch (Tervaniemi et al., 1994). Furthermore, MMN studies have 
shown that the traces of sound representations contain feature-integrated 
information (Gomes et al., 1997; Sussman et al., 1998; Takegata et al., 
1999). In these studies, subjects were presented with two types of standard 
tones differing from each other (e.g., in frequency and intensity). The 
deviants possessed one feature of each standard and therefore formed a 
deviant conjunction of the frequent levels of the two attributes to elicit an 
MMN. Taken together, these findings suggest that the cortical traces 
reflected by the MMN contain integrated spectral, temporal, and even 
abstract information about sound events. 



3. ACCURACY OF THE NEURAL
REPRESENTATIONS FOR SOUNDS

There is a narrow range of any standard stimulus feature within which a 
deviant stimulus elicits no MMN (Näätänen & Alho, 1995, 1997). Figure 2
illustrates this phenomenon, which is termed the representational width 
(Rw). For a typical young adult, the MMN Rw is about 0.5-2.0% for 1000 
Hz standard tones, so that deviant tones within the 995-1005 Hz
frequency range (Rw = 0.5%) usually do not activate the change-detection
process. That is, the deviants elicit a small, nonsignificant MMN and are not 
usually consciously discriminated. The individual sharpness (informational 
specificity) of the sound representations can be, at least theoretically, defined 
with the MMN; the narrower the Rw, the sharper and more stimulus-specific
is the sound representation in the brain. 
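
The arithmetic behind the 995-1005 Hz example can be made explicit. The sketch below assumes, as that example implies, that Rw is expressed as a percentage of the standard frequency applied symmetrically on either side of it; the function names are hypothetical, and the all-or-none threshold is a simplification of what is in practice a graded response.

```python
def rw_band(standard_hz, rw_percent):
    """Frequency band (low, high) within which a deviant elicits no MMN."""
    half_width = standard_hz * rw_percent / 100.0
    return (standard_hz - half_width, standard_hz + half_width)


def outside_rw(standard_hz, deviant_hz, rw_percent):
    """True if the deviant falls outside the band and so should elicit MMN."""
    low, high = rw_band(standard_hz, rw_percent)
    return not (low <= deviant_hz <= high)


print(rw_band(1000.0, 0.5))             # (995.0, 1005.0)
print(rw_band(1000.0, 2.0))             # (980.0, 1020.0)
print(outside_rw(1000.0, 1003.0, 0.5))  # False
print(outside_rw(1000.0, 1010.0, 0.5))  # True
print(outside_rw(1000.0, 1010.0, 2.0))  # False
```

The last two lines show why a narrower Rw implies a sharper representation: the same 10 Hz deviation falls outside a 0.5% band but inside a 2.0% band.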

[Figure 2 panel title: MMN as a Function of Frequency Change]

Figure 2. Schematic illustration of representational width (Rw) along a sensory dimension. 
Rw is the range around the standard-stimulus level in a given sensory feature, within which a 
deviant stimulus elicits no MMN (after Näätänen & Alho, 1997).

Figure 3 illustrates the relationship between behavioral discrimination 
ability and the MMN (Lang et al., 1990). In this study, 17-year-old
high-school students were divided into three groups according to their
behavioral pitch-discrimination performance (‘good’, ‘moderate’, ‘poor’),
and the MMN to frequency changes of different magnitudes was recorded in a
separate session. In the good-performer group, a frequency deviation of 19 Hz
was enough to elicit MMN, whereas a deviation of 50 to 100 Hz was needed in
the poor-performer group for MMN elicitation; the moderate performers fell
between these two extreme groups. In subsequent studies, MMN amplitude
was shown to correlate with the behavioral discrimination of rhythmic sound 
patterns (Tervaniemi et al., 1997) and within-category examples of a vowel 
(Aaltonen et al., 1994). Corresponding results were obtained in studies 
demonstrating the emergence of the MMN when subjects learned to 
discriminate a change in a complex spectro-temporal tone pattern (Näätänen
et al., 1993b) or when subjects learned to discriminate different variants of a 
consonant-vowel (/da/) syllable (Kraus et al., 1995). Thus, MMN is sensitive 
to individual auditory differences, as well as to the emergence of 
discrimination capability. 
[Figure 3 panel title: Discrimination Ability; waveform legend: Deviant, Standard]

Figure 3. Grand-average ERPs for subjects who were ‘good’, ‘moderate’, or ‘poor’ in 
behavioral pitch-discrimination of infrequent deviant tones from standard tones in a difficult 
pitch discrimination task. MMN amplitude and latency (shaded areas) for the frequency
deviation differ among the groups (after Lang et al., 1990).



4. EMERGENCE OF SOUND REPRESENTATIONS 

As suggested by the loudness summation of tones (Scharf & Houtsma, 
1986) and the test-to-masking stimulus interval in backward-masking studies 
(Hawkins & Presson, 1986), the temporal window of integration in auditory 
perception has a duration of 150-200 milliseconds. Figure 4 schematically 
illustrates the formation of the cortical trace underlying sound
perception. The MMN has also been used in determining the time needed for
the emergence of the central sound representation. Winkler and Näätänen
(1992) presented tones that were each followed, at varying time intervals,
by a masker tone. MMN was elicited when the silent interval between the
test tones (standards and deviants) and the masker was 150
milliseconds or longer but not at the shorter intervals. The masker tone
presumably prevented trace formation, since with time intervals shorter than
150 milliseconds subjects were unable to behaviorally discriminate the
deviants. Subsequent MMN studies (Schröger, 1997; Yabe et al., 1997,
1998; Sussman et al., 1999) have verified that the temporal window of
integration, as estimated from MMN data, has a duration corresponding to
that suggested by the behavioral studies. These findings suggest that this
integration time is necessary for acoustical features to form a unitary 
auditory event, rather than being represented as static features for each time 
point (Näätänen, 1992).



[Figure 4 labels: Perception (output provided with time dimension)]

Figure 4. A schematic illustration of the emergence and decay of the central sound
representation (CSR). First, sound attributes are rapidly mapped on the respective separate 
feature analyzers whose outputs are subsequently mapped on the neurophysiological 
mechanisms of sensory memory so that the basis for unitary sound perception emerges 
through feature and temporal integration. The emerging sound representation has a time 
dimension, as sounds are represented as events in time rather than as individual static 
features. The emergence of this sound representation provides the specific information 
contents for the sound percept (after Näätänen & Winkler, 1999).



5. SHORT-TERM AND LONG-TERM MEMORY 
TRACES FOR SOUND REPRESENTATIONS 

Modeling of the auditory environment in the form of sound 
representations is based on the memory traces of short-term sensory 
memory. The MMN is thought to reflect these traces and thereby provides 
an indicator for their development and possible neuroanatomical locations. 
The MMN is elicited when a discrepancy is found between the input from a 
deviant stimulus and the trace of the standard stimulus. This outcome, 
however, requires that the standard-stimulus trace has not decayed and, 
therefore, the standard sounds must be repeatedly presented at relatively 
short intervals. The duration of these short-term traces is estimated to be of 
the order of 10 seconds (Cowan et al., 1993; Sams et al., 1993; see also 
Näätänen et al., 1987b). MMN data also suggest that sensory memory can
maintain more than one sound representation in parallel, such that at least 
two (Sams et al., 1984; Winkler et al., 1996a) or even several (Gomes et al.,
1997; Ritter et al., 1995) memory traces can be simultaneously active. 

[Figure 5 panel title: MMN and Discrimination Training]

Figure 5. Grand-average ERPs from Cz to standard serial sound patterns (thin lines) and to
deviant patterns (thick lines) occurring randomly with a probability of 0.1. The standard and
deviant sound patterns consisted of 8 consecutive tones of different frequencies and are 
illustrated at the bottom of the figure. In the deviant patterns, the frequency of the sixth tone 
(indicated by the arrow) was higher than in the standard patterns. ERPs were recorded during 
the early, middle, and late phases of a session in which sound patterns (1,200 in each phase) 
were presented to subjects who were reading. The performance of the subjects belonging to 
this group improved during the session in a sound-pattern discrimination test applied after 
each phase of the session. MMN (shaded area) first emerged and then increased in amplitude 
during the session (after Näätänen et al., 1993b).

One presentation of the standard stimulus seems to be enough to 
reactivate the standard-stimulus trace if it has decayed to the extent that 
deviants no longer elicit MMN (Cowan et al., 1993; Winkler et al., 1996a). 
This finding indicates that the standard-stimulus trace formed in auditory 
short-term sensory memory has coalesced into a durable form of sensory 
memory (Cowan et al., 1993). Figure 5 illustrates this effect, with a longer- 
term learning effect for MMN generation demonstrated by using complex 
spectro-temporal sound patterns and a minor deviation in one tone 
component (Näätänen et al., 1993b). Passive (subjects reading) MMN-
recording blocks were alternated with active (deviant detection) sound- 
discrimination blocks, which produced MMN emergence as subjects started 
behaviorally to discriminate deviants from standards. Thus, longer-term 
memory traces are slowly formed with attentive training, as the passive
exposure per se was not sufficient to cause MMN emergence. 



6. LANGUAGE-SPECIFIC SPEECH-SOUND 
TRACES 



Long-term auditory traces are presumably crucial for speech perception 
and serve as recognition patterns for speech-sound segments, in which some 
invariant relationships among the acoustic elements rather than sensory 
information per se are encoded (Näätänen, 2001). Figure 6 summarizes MMN
findings that suggest the existence of permanent memory traces for
native-language vowels (Näätänen et al., 1997). In this study, Estonian and
Finnish subjects were presented with phoneme contrasts. When the standard
phoneme /e/ (shared by both languages) was replaced by a phoneme of the
Estonian language /õ/, MMN was larger in Estonian than Finnish subjects,
whereas it did not differ between the two groups when the deviant stimulus
was a phoneme belonging to both languages (/ö/ or /o/). Corroborating
results have been obtained with the French (Dehaene-Lambertz, 1997) and
Hungarian (Winkler et al., 1999b) languages. In addition to phonemes and
phonological units (Phillips et al., 2000; Dehaene-Lambertz et al., 2000),
memory traces for words are also reflected by the MMN (Pulvermüller et al.,
2001), since larger-amplitude MMN was found in native Finnish speakers to 
a syllable change producing a two-syllable Finnish word than a non-word, 
with the effect being absent in subjects who did not understand Finnish. 

Learning a new language can be monitored with the MMN, which 
appears to reflect development of the phonemic cortical memory 
representations of the new language. MMN has been observed to a contrast 
between two Finnish vowels in adult Hungarians who were fluent Finnish 
speakers, but not in Hungarians who had no experience of Finnish (Winkler 
et al., 1999a). Hence, in those Hungarians fluent in Finnish, the cortical 
memory representations of the Finnish vowels had developed as they learned 
this language. In children, speech-sound traces for the mother tongue, as 
indexed by the enhanced MMN elicitation when the deviant stimulus is a 
phoneme of this language, are formed between 6 and 12 months (Cheour et al.,
1998a) or even earlier in infancy (Dehaene-Lambertz & Baillet, 1996), well
before speech production proper begins. Taken together, these results
demonstrate the sensitivity of MMN to language-related stimulus 
processing. 
[Figure 6 panel titles: (a) MMN amplitude; (b) MMN and MMNm; (c) ECD strength; (d) Source mapping, left and right hemispheres of a Finnish subject]

Figure 6. (a) MMN (mean + s.e.m.) peak amplitude (at Fz) in Finns and Estonians as a
function of the deviant stimulus, arranged in the order of increasing F2 difference from the
standard stimulus. (b) MMN (at Fz, solid lines) and MMNm (left hemisphere; broken line)
peak latencies as a function of the deviant stimulus for Finns and Estonians. (c) Strength of
the equivalent current dipole (ECD) modeling the left auditory-cortex MMN for the different
deviant stimuli (n=9). (d) Left- and right-hemisphere MMNms of one typical Finnish subject
for deviants /ö/ and /õ/ presented in contour (spacing 2 fT/cm) maps of the magnetic
field-gradient amplitude at the MMNm peak latency. The squares indicate the arrangement of the
magnetic sensors. The arrows represent ECDs indicating activity in the auditory cortex; the
black dots in these arrows show the centers of gravity of the MMNm. Note that the prototype
/ö/ elicits a much larger MMNm in the left than in the right hemisphere, whereas
non-prototype /õ/ responses in both hemispheres are small and quite similar in amplitude (after
Näätänen et al., 1997).
7. NEURAL LOCI OF SHORT- AND LONG-TERM
AUDITORY MEMORY TRACES 

Several MMN studies suggest specialization within and between the 
hemispheres already at a very early and automatic level in processing 
different types of auditory information. The neural locations of the traces for 
the different properties of the same sound produce differences in the MMN 
sources for changes in these attributes (Giard et al., 1995; Levänen et al.,
1993; Paavilainen et al., 1991). Magnetic recordings of the MMNm elicited 
by a change in simple vs. complex sounds also suggest that the sensory- 
memory representations of these sounds are located in different parts of the 
auditory cortex (Alho et al., 1996). In addition, recent MEG results indicate 
that the traces for phonemes and chords matched in frequency differ in 
location in both hemispheres (Tervaniemi et al., 1999). This intra- 
hemispheric specialization has also been observed in patients with left- 
hemisphere lesions (Aaltonen et al., 1993): when the lesion was located in 
the posterior areas, no MMN to a phoneme change was elicited, whereas the 
frequency change of a simple tone elicited an MMN. In contrast, patients 
with anterior lesions showed an MMN to both phoneme and tone-frequency 
changes. Consistent with these findings, recent studies have indirectly 
located the traces of native-language vowels to the left hemisphere, in or 
near Wernicke’s area. For example, in Finnish subjects, the left auditory 
cortex dominated the MMNm elicited by the native-language deviant 
vowels, whereas similar but smaller MMNms were generated in the auditory 
cortices of both hemispheres when the deviant stimulus was a vowel of the 
Estonian but not Finnish language (Näätänen et al., 1997). Additional
evidence for left-hemispheric lateralization of vowel memory traces also has 
been reported by studies using different methodologies (ERPs, Rinne et al., 
1999; MEG, Gootjes et al., 1999; positron-emission tomography or PET, 
Tervaniemi et al., 2000). Finally, the left hemisphere also dominates the 
processing of longer phonetic units: a stronger MMNm was elicited by
consonant-vowel (CV) syllable changes in the left than right hemisphere (Alho et
al., 1998; Shtyrov et al., 1998; 2000), with complementary fMRI data (Celsis
et al., 1999) showing a stronger left-hemispheric than right-hemispheric
activation to CV-syllable changes.

These results imply that the acoustic features of the speech sounds are 
represented by memory traces formed in both hemispheres, whereas 
phonetic information (i.e., about the phonetic invariance; Aulanko et al., 
1993) is represented in the left hemisphere. Consistent with this assertion, 
MMN elicited by the syllable /da/ was larger over the left than right 
hemisphere when the change signalled a phonetic change, in contrast to 
similar MMNs over the left and right hemispheres when the same syllable 
signalled a pitch change (Sharma & Kraus, 1995). Thus, the different 
features of the speech sounds, such as the linguistic versus prosodic features
of speech, might be represented by separate traces. Consequently, the 
cortical traces probed by the MMN are excellent candidates for acting as 
recognition templates in speech perception (Näätänen, 2001).



8. MMN AND ATTENTIVE CHANGE-DETECTION 

The formation of short-term sound representations enables neural 
maintenance of auditory information over a period of several seconds, 
thereby creating the recent auditory past. This presence of the immediate 
auditory history is often described as echoic memory (Cowan, 1988) and is 
needed for recognizing sound events from the continuously varying acoustic 
signal. The auditory context modeled in the sound representations (Winkler 
et al., 1996b) provides the reference information against which any change is 
detected — an event that usually results in an attention switch to this change. 
Several studies support the idea that MMN elicitation is related directly to 
involuntary attention switching. For example, Schröger (1996) presented
subjects with irrelevant auditory stimulation while they were performing an 
auditory primary task. When a minor frequency change, eliciting an MMN, 
preceded the to-be-detected target stimuli, reaction time was prolonged and 
hit rate decreased, indicating an attention shift to the irrelevant frequency 
change. Similar results were obtained with visual primary tasks (Alho et al., 
1997; Escera et al., 1998, 2000). Consistent with these observations are
findings that implicate the frontal cortex, which is known to play a central
role in controlling the direction of attention, in MMN
generation (for a review, see Alho, 1995). The frontal MMN generator is
usually stronger in the right than in the left hemisphere (Giard et al., 1990).
Interestingly, the time course of this frontal activation is slightly delayed 
relative to the auditory-cortex activation (Rinne et al., 2000). Thus, the 
auditory-cortex processes underlying automatic change detection trigger the 
frontal mechanisms of involuntary attention switch (Näätänen, 1990).

However, a sound change does not cause an attention switch each time the
change occurs, so that the stimulus change can remain consciously 
unperceived. This outcome occurs for several reasons: First, it may be that 
the deviant sound does not differ enough from that represented by the 
sensory-memory trace, either because the difference is very small or the 
neuronal trace is informationally too diffuse. That is, its representational 
width for the auditory attribute involved is too large in relation to the 
magnitude of the change (Näätänen & Alho, 1997). Second, the memory
trace underlying the previous sounds may already have decayed, so that no 
automatic comparison process can take place. Third, when attention at the 
moment of change is intensively focused elsewhere, then the threshold for 
attention switch is presumably elevated (Lyytinen et al., 1992; Näätänen,
1991, 1992). Fourth, the excitability of the frontal MMN generator may be
temporarily decreased, for example, by alcohol (Jääskeläinen et al., 1996).
The system appears to be finely tuned so as to engage attentional switching
and produce MMN under conditions that ensure sufficient stimulus
processing has occurred.



9. MMN AS A CLINICAL INDEX FOR CENTRAL 
AUDITORY PROCESSING 

As the MMN process reflects the central sound representation underlying 
both sound perception and sensory memory, this ERP has been used to 
assess the specific nature and degree of auditory processing disorders as well 
as a probe for ascertaining the general state of the brain. The MMN can be 
recorded from newborns (Alho et al., 1990) and even from pre-term 
newborns (Cheour-Luhtanen et al., 1996) to provide a measure of central 
auditory processing for evaluation of children who are diagnosed with or at
risk for auditory processing deficits. For example, children with cleft palate
or other genetic disorders (Cheour et al., 1998b), infants at risk for
dyslexia (Leppänen et al., 1997), and children suffering from dysphasia
(Korpilahti & Lang, 1994) show abnormalities in MMN. If auditory
processing problems can be detected early in infancy, more time is available
for rehabilitation, which may therefore be more effective in these children.
The early identification of children with or at risk for language
problems is also an important application of MMN, as the first 2-3 years of
life are critical for language development. For example, comparative data on
the development of the auditory sensory-memory system in relation to language
development are now available for normal infants (for a review, see Kraus &
Cheour, 2000).

Recent MMN studies have indicated that the deficits of auditory 
processing underlying speech disorders may have a more general nature. In 
adult developmental dyslexics, MMN to a shortening of a tone interval in the 
midst of a 4-tone pattern was absent, whereas a well-developed MMN was 
elicited in controls — findings that implicate problems in the temporal 
integration of auditory information in dyslexics (Kujala et al., 2000). Similar 
results were obtained in left-hemisphere stroke patients, in whom both the MMN 
to duration decrements of harmonic tones and the behavioral discrimination of 
these decrements were impaired (Ilvonen et al., 2001). The improvement of sound discrimination 
in cochlear-implant patients, as indexed by an increasing MMN amplitude 
after the implantation, further supports the important relation between the 
cortical discrimination accuracy and speech-perception ability (Ponton et al., 
2000). Thus, MMN can facilitate the identification of deficits in the very 
basic cortical mechanisms of sound processing that appear to underlie 
speech-sound processing. 

MMN also can reflect general brain functioning in aging, Alzheimer’s 
disease, and Parkinson’s disease (for a review, see Pekkonen, 2000), as 
MMN sensory-memory traces decay at an accelerated rate. MMN has been 
used to assay frontal-lobe damage (Alho et al., 1994), closed-head injuries 
(Kaipio et al., 2000), and the right-hemispheric damage causing the neglect 
syndrome (Deouell et al., 2000). As noted above, MMN can be used to 
predict the awakening of coma patients (Kane et al., 1993, 1996; Fischer et 
al., 2000; Morlet et al., 2000). In addition, MMN has proven useful in 
assessment of psychiatric conditions, such as schizophrenia (Shelley et al., 
1991), depression (Ogura et al., 1993), and alcoholism (Ahveninen et al., 
2000; Polo et al., 1999), wherein central auditory processing appears to be 
compromised. More generally, MMN reflects the temporary fluctuations in 
the general brain function caused by alcohol ingestion that attenuates MMN 
amplitude (Jaaskelainen et al., 1998), and specifically its frontal 
subcomponent (Jaaskelainen et al., 1996). 



10. CONCLUSIONS 

The brain’s automatic sound-change detection mechanism is indexed by 
the MMN, which is based on the sound-event representations carried by the 
auditory cortical memory traces. MMN is presumably elicited when a trace 
is formed by a deviant sound event in the sensory-memory system of the 
auditory cortex where an active sensory-memory trace for the repetitive 
aspects of the preceding auditory stimulation already exists (Naatanen, 
1985). Hence, those sensory-memory neurons activated by the deviant sound 
that were not already involved in representing the standard stimulus (but 
were released from tonic inhibition by the emergence and presence of this 
representation), likely contribute to MMN generation (Naatanen, 1990). This 
comparison process is automatic and requires no attention directed to the 
sounds, although it usually results in an attention switch to the eliciting 
sound change. 

MMN studies have demonstrated that acoustic features are represented in 
the sensory-memory system as a unitary sound event with integrated feature 
and temporal information (Naatanen & Winkler, 1999). Even the abstract 
regularities and rules of the constantly changing auditory environment are 
reflected by these cortical representations, which appear to correspond to the 
conscious perception of the sound event. This relation of the neural traces 
involved in MMN elicitation to conscious perception is confirmed by the 
generally good correspondence of MMN amplitude and latency with the 
behavioral discrimination performance (Lang et al., 1990; Tiitinen et al., 
1994; Winkler et al., 1997). Further, the formation phase of the 
representation can be related to sound perception, with the slowly decaying 
phase lasting for the few seconds underlying the sensory memory of the 
sound event (Naatanen & Winkler, 1999). Thus, the sound-event 
representations formed and stored by the cortical traces underlying MMN 
elicitation serve as a context for change detection and contribute to the 
conscious percept of the change. 

Magnetic and intracranial recordings have confirmed that the main 
generators of the MMN response are located in the auditory cortices 
(Kropotov, 2000; reviewed in Alho, 1995), with the frontal generator related 
to the attention switch to the eliciting sound change (Giard et al., 1990; 
Rinne et al., 2000). Moreover, the neural substrates involved in MMN 
elicitation are spatially distributed according to the physical or abstract 
features of the sound events. These findings add to the growing evidence for 
specialization within and between the auditory cortices in representing sound 
events that differ in nature, as demonstrated by the differences in the 
strengths and locations/orientations of the MMN responses to different 
sound changes. 

The individual variation in the behavioral sound-discrimination ability is 
directly associated with the individual variation in MMN emergence, 
suggesting appreciable individual variation in the accuracy of the cortical 
sound representations. Differences in accuracy presumably depend upon the 
receptive-field width of the afferent neurons feeding information to the 
sensory-memory system (Naatanen & Alho, 1997). Besides this normal 
inter-individual variation in the accuracy of the sound representations, 
additional variability can stem from language or musical (Tervaniemi, 2000) 
expertise and problems in auditory processing. For example, the inaccuracy 
in some feature(s) of the sound-event representation(s) can affect the 
recognition of the speech sounds from the acoustic stream. Backward 
masking might also, in some conditions such as dyslexia (Kujala et al., 
2000) and chronic alcoholism (Ahveninen et al., 1999), be pathologically 
enhanced, thereby promoting the too rapid loss of sensory information 
contained by traces of preceding sound stimuli. 

In conclusion, MMN provides an objective tool for studying how the 
auditory environment is represented in the human brain and how the 
conscious perception of sounds is achieved. The accuracy, development, and 
cortical locations of the sound representations can be precisely delineated 
with the MMN. Finally, MMN is becoming quite useful for testing cortical 
sound-discrimination accuracy and related attention/memory-switch 
functions in a variety of clinical populations. 

As first reported by Naatanen, Gaillard, and Mantysalo (1978, 1980), 
infrequent ("deviant") sounds occurring in a sequence of repetitive ("standard") 
sounds elicit the mismatch negativity (MMN) event-related brain potential 
(ERP), even when the listener is instructed to attend to other stimuli. MMN is 
seen as a negative-polarity displacement of the ERP to deviant sounds in relation 
to the ERP from standard sounds around 100-200 milliseconds from deviant- 
event onset (see Figure 1). As in the early reports of Naatanen and his colleagues, 
most subsequent MMN studies have applied tones as stimuli and deviancies in 
some simple feature (e.g., pitch, intensity, duration, or location) to elicit MMN 
(for a review, see Naatanen, 1992). However, MMN is also elicited by infrequent 
changes and irregularities in complex sounds, such as phonemes, syllables, 
chords, and tone patterns (for recent reviews, see Naatanen & Alho, 1997; 
Naatanen & Winkler, 1999; Schroger, 1997). 
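
The deviant-minus-standard subtraction described above can be illustrated with a minimal sketch. The sample values, the 50-millisecond sampling step, and the function names below are hypothetical and serve only to show the arithmetic; they are not data or procedures from the studies cited.

```python
# Minimal sketch: MMN as a deviant-minus-standard difference wave.
# All amplitudes (in microvolts) are invented for illustration.

def average_erp(epochs):
    """Average equally long single-trial epochs sample by sample."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

def difference_wave(deviant_erp, standard_erp):
    """Subtract the standard ERP from the deviant ERP at each sample."""
    return [d - s for d, s in zip(deviant_erp, standard_erp)]

def negative_peak(wave, times_ms, window=(100, 200)):
    """Return (latency_ms, amplitude) of the most negative sample in the window."""
    candidates = [(amp, t) for amp, t in zip(wave, times_ms)
                  if window[0] <= t <= window[1]]
    amp, t = min(candidates)
    return t, amp

times_ms = [0, 50, 100, 150, 200, 250]          # hypothetical sample times
standard_epochs = [[0.0, -1.0, -2.0, -1.0, 0.0, 0.5],
                   [0.0, -1.0, -2.0, -1.0, 0.0, 0.5]]
deviant_epochs = [[0.0, -1.0, -3.0, -4.0, -1.0, 1.0],
                  [0.0, -1.0, -3.0, -4.0, -1.0, 1.0]]

standard_erp = average_erp(standard_epochs)
deviant_erp = average_erp(deviant_epochs)
mmn_wave = difference_wave(deviant_erp, standard_erp)
latency_ms, amplitude_uv = negative_peak(mmn_wave, times_ms)
print(latency_ms, amplitude_uv)
```

Restricting the peak search to the 100-200 millisecond window follows the latency range given in the text; in practice the window is chosen per study.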

The brain process that generates MMN is evidently triggered by a mismatch 
between a deviant auditory stimulus or event and a memory representation of the 
regularities in the preceding auditory stimulation (Naatanen, 1992; Winkler et al., 
1996; Winkler & Czigler, 1998). This interpretation is supported by results 
showing that infrequent sounds presented alone, without intervening repetitive 
sounds or a change in the beginning of a sound sequence, do not elicit the MMN 
(Cowan et al., 1993; Korzyukov et al., 1999; Kropotov et al., 2000; Naatanen et 
al., 1989). 

MMN has proven to be a successful tool for studying preattentive auditory 
perception and memory functions (Naatanen, 1995; Naatanen & Alho, 1997; 
Naatanen & Winkler, 1999; Ritter et al., 1995; Schroger, 1997). For example, the 
speed of the preattentive processing of auditory stimulus changes that generates 
the MMN appears to explain the speed of active discrimination of these changes 
when the auditory stimuli are attended, as shown by the strong correlation 
between the decrease of MMN peak latency and shortening of reaction times 
(RTs) with increasing stimulus deviance (Tiitinen et al., 1994). Such active 
discrimination of stimulus changes is reflected by the N2b, another negative ERP 
component, which overlaps MMN and is followed by the positive P3b (P300) 
component (Alho et al., 1992; Donchin & Coles, 1988; Naatanen, 1990, 1992; 
Naatanen et al., 1993; Sams et al., 1985; Sutton et al., 1965). However, the 
functional role of the brain process generating the MMN is likely associated with 
orienting response initiation (Ohman, 1979; Sokolov, 1975) to changes in the 
auditory environment (Naatanen, 1992; Naatanen & Michie, 1979; Schroger, 
1996). 

Figure 2 illustrates the fronto-centrally dominant scalp distribution of MMN, 
which is mainly explained by the sum of bilaterally generated auditory-cortex 
activity in these brain areas (Giard et al., 1995; Rinne et al., 1999a; Scherg et al., 
1989). Source modeling of MMNm, the magnetoencephalographic (MEG) 
counterpart of MMN, supports this interpretation (Alho et al., 1998; Hari et al., 
1984; Levanen et al., 1996). Intracranial MMN recordings have also indicated 
MMN generation occurs in the auditory cortex (Csepe et al., 1987; Halgren et al., 
1995a; Javitt et al., 1996; Kraus et al., 1994; Kropotov et al., 1995, 2000; Liasis 
et al., 1999). Additionally, studies applying functional magnetic resonance 
imaging (fMRI, Celsis et al., 1999; Opitz et al., 1999), positron emission 
tomography (PET, Tervaniemi et al., 2000), and event-related optical signals 
(EROS, Rinne et al., 1999b) are also consistent with this source of MMN 
generation. Moreover, patients with temporal-cortex lesions demonstrated attenuated 
scalp-recorded MMNs (Aaltonen et al., 1993; Alain et al., 1998). 

However, scalp current density (SCD) analysis of MMN voltage distribution 
over the head suggests that MMN gets an additional contribution from the 
prefrontal brain areas (Deouell et al., 1998; Giard et al., 1990; Rinne et al., 2000; 
Serra et al., 1998; Yago et al., 2001). This finding is supported by MEG 
(Levanen et al., 1996) and fMRI recordings (Celsis et al., 1999; Opitz et al., 
2002). As prefrontal cortex has an important role in determining the direction of 
attention (Fuster, 1986; Stuss & Benson, 1989), the prefrontal MMN activity is 
thought to index the involuntary orienting of attention to a change in the acoustic 
environment detected by the auditory-cortex MMN mechanism (Giard et al., 
1990; Naatanen, 1992; Naatanen & Michie, 1979). 

Figure 1 also illustrates the positive P3a response that often follows the MMN 
(Alho et al., 1998; Escera et al., 1998; Naatanen et al., 1982; Sams et al., 1985), 
which is associated with the actual switching of attention (Escera et al., 1998; 
Ford et al., 1976; Knight & Scabini, 1998; Squires et al., 1975; Woods, 1990). 
This association is suggested by results showing that large P3a responses are 
elicited by attention-catching, widely deviant, complex “novel” sounds (e.g., a 
dog barking or a telephone ringing) even when they occur in a to-be-ignored 
auditory stimulus sequence (Alho et al., 1998; Escera et al., 1998; Woods, 1992; 
Woods et al., 1993). When a small change occurs in an unattended auditory 
stimulus sequence, P3a in the ERP of a deviant stimulus may be quite small in 
amplitude or may fail to be elicited, perhaps because small stimulus changes do 
not always catch attention (Alho et al., 1992, 1998; Escera et al., 1998; Sams et 
al., 1985; Tiitinen et al., 1994). This lack of response is indicated by results 
showing that even when such changes elicit MMN, they are not always followed 
by heart-rate deceleration or skin-conductance increase controlled by the 
autonomic nervous system and commonly associated with involuntary orienting 
of attention (Lyytinen et al., 1992). According to SCD mapping of P3a (Yago et 
al., submitted), source modeling of P3a (Mecklinger & Ullsperger, 1995) and of its 
MEG counterpart (Alho et al., 1998), intracranial P3a recordings (Alain et 
al., 1989; Baudena et al., 1995; Halgren et al., 1995a,b; Kropotov et al., 1995), 
and effects of local brain lesions on P3a (Knight, 1984, 1996; Knight et al., 
1989), the distributed cerebral network of involuntary attention switching 
activated by novel sounds includes at least the superior temporal, dorsolateral 
prefrontal, and parietal cortical areas, the parahippocampal and anterior cingulate 
gyri, and the hippocampus. 

However, infrequent auditory stimuli occurring outside the current focus of 
attention may cause involuntary attention switching without eliciting MMN, as is 
the case for infrequent sounds delivered without intervening standard sounds and 
for sounds beginning an auditory stimulus sequence after a relatively long silent 
period (Naatanen, 1992). These sounds do not elicit MMN but they evoke an 
enhanced N1 component (Cowan et al., 1993; Kropotov et al., 2000; Korzyukov 
et al., 1999; Naatanen et al., 1989), which peaks at about 100 milliseconds from 
stimulus onset and is sensitive to stimulation rate (Naatanen & Picton, 1987). A 
sound repeated at a high rate elicits only a small N1, whereas sounds delivered at 
a low rate elicit a large N1, due to enhanced activity of modality-specific and 
non-specific brain areas (Giard et al., 1994; Hari et al., 1982; Naatanen & Picton, 
1987; Naatanen & Winkler, 1999). Moreover, in addition to MMN, a widely 
deviant sound in a sequence of repeating sounds (e.g., a novel sound among tone 
pips) may elicit an enhanced N1, presumably because it activates a population of 
new feature-specific (e.g., frequency-specific) neurons in the auditory cortex 
(Alho et al., 1998; Escera et al., 1998). It has been suggested that whereas MMN 
is generated by a process initiating involuntary attention to auditory stimulus 
changes, the auditory N1 response is generated by a process involved in directing 
the focus of attention to onsets of new events in the auditory environment 
(Naatanen, 1992). This suggestion is supported by findings that novel sounds that 
elicit an enhanced N1 also elicit a large P3a, indicating engagement of attention 
by these sounds (Alho et al., 1998; Escera et al., 1998; Woods, 1990). Finally, 
ERPs also provide an index for redirecting attention back to the current task after 
involuntary switching of attention away from this task. This redirecting process 
might generate the recently discovered “reorienting negativity” (RON; Schroger 
& Wolff, 1998a) following P3a response to deviant stimuli, as shown in Figure 1. 

Figure 1. Event-related brain potentials (ERPs) recorded at the central midline scalp site (Cz) 
to task-irrelevant standard tones of 600 Hz, to 700-Hz deviant tones, and complex novel sounds 
infrequently replacing the standard tone (sound onset at 0 millisecond). Each sound was followed 
by a task-relevant visual stimulus (onset at 300 milliseconds) that was to be discriminated and 
responded to by the subject and therefore elicited the positive P3b component. Right: Difference 
waves obtained by subtracting the ERP following the standard tones from the ERP following 
deviant tones and from the ERP following novel sounds. The difference wave for deviant tones 
shows the mismatch negativity (MMN), followed by a small positive P3a response and the 
“reorienting negativity” (RON). The difference wave for novel sounds shows a negative wave, 
consisting of MMN and an enhanced N1 response, followed by a large P3a and by RON (after 
Escera et al., 1998). 

Figure 2. Voltage distribution map for the fronto-centrally maximal mismatch negativity 
(MMN) elicited by infrequent, slightly higher deviant tones (700 Hz) occurring among repetitive 
standard tones (600 Hz) presented to a subject concentrating on a visual task. MMN amplitudes 
were measured around the MMN peak latency (150 milliseconds after stimulus onset) from 
difference waves obtained by subtracting event-related potentials (ERP) to standard tones at 
different scalp sites from those to deviant tones (cf. Figure 1). The head is viewed from above, the 
nose pointing upwards. Lighter shades of gray indicate more negative voltages and the small circles 
indicate the locations of scalp electrodes used to record MMN (after Yago et al., 2001). 



1. EVENT-RELATED BRAIN POTENTIALS TO 
SOUND CHANGES DISTRACTING AUDITORY 
TASK PERFORMANCE 

Evidence for the involvement of the MMN and P3a generator mechanisms in 
involuntary attention switching has been provided by simultaneous registration of 
behavioral and ERP effects of auditory distractors. Measurements of ERPs to 
auditory stimulus changes and distraction of auditory task performance by these 
changes have been carried out with two different types of tasks. In two-channel 
selective-attention tasks, subjects are dichotically presented with sounds and 
instructed to selectively attend to the input of one ear and to respond to particular 
target sounds occurring in this input. Distraction is indicated by poorer 
performance when target stimuli are preceded by infrequent, deviant sounds in 
the unattended input relative to the performance when the targets are preceded by 
frequent, standard sounds in the attended or unattended input. In one-channel 
tasks, distracting and task-relevant aspects of stimulation are embedded in the 
same acoustic event. For example, subjects are presented with equiprobably 
occurring short and long sounds that can be of a frequent standard frequency or 
of an infrequent deviant frequency. The subjects’ task is to discriminate short 
from long tones and to disregard task-irrelevant frequency changes. Distraction is 
indicated by poorer duration-discrimination performance when a task-irrelevant 
frequency change occurs in the sound. 
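
As a rough illustration of how such a behavioral distraction effect might be quantified, the sketch below compares mean RT and hit rate between trials with and without a task-irrelevant change. The trial records, the field layout, and the choice to average RTs over correct responses only are assumptions made for the example, not the scoring rules of any study cited here.

```python
# Minimal sketch: behavioral distraction as an RT cost and a hit-rate cost.
# Hypothetical trial records: (sound_type, reaction_time_ms, response_correct)
trials = [
    ("standard", 400, True),
    ("standard", 420, True),
    ("standard", 410, True),
    ("standard", 430, False),
    ("deviant", 450, True),
    ("deviant", 470, False),
]

def summarize(trials, sound_type):
    """Return (mean RT over correct trials, hit rate) for one sound type."""
    selected = [t for t in trials if t[0] == sound_type]
    correct_rts = [rt for _, rt, correct in selected if correct]
    mean_rt = sum(correct_rts) / len(correct_rts)
    hit_rate = len(correct_rts) / len(selected)
    return mean_rt, hit_rate

standard_rt, standard_hits = summarize(trials, "standard")
deviant_rt, deviant_hits = summarize(trials, "deviant")

rt_cost_ms = deviant_rt - standard_rt          # positive value = slower after change
hit_rate_cost = standard_hits - deviant_hits   # positive value = less accurate
print(rt_cost_ms, hit_rate_cost)
```

A positive RT cost or hit-rate cost in this scheme corresponds to the "poorer performance" that the text treats as the sign of distraction.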

For example, Woods and colleagues (1993) used a two-channel 
selective-attention task in which high-frequency tones were presented to one ear and 
low-frequency tones to the other in a random order. Infrequent long-duration tones 
and occasional novel sounds were randomly interspersed in the stimulus 
sequence. Subjects were instructed to attend to tones delivered to one ear and to 
make a button-press response to long-duration tones occurring among the 
attended tones. RTs to these target stimuli were prolonged by more than 300 
milliseconds when a target was preceded by a task-irrelevant novel sound even 
when the preceding novel sound occurred among the to-be-ignored tones. Novel 
sounds elicited MMNs followed by N2b and P3a responses both when they 
occurred among the to-be-attended tones and when they occurred among the to- 
be-ignored tones. No effects of duration deviants in the unattended channel on 
subsequent targets in the attended channel were reported. Longer-duration 
deviant tones in the attended and unattended input elicited MMNs, but the N2b 
and P3 responses were confined to duration-deviants (targets) in the attended 
input. This pattern of results suggests that sound changes in the unattended 
channel may cause involuntary switches of attention, but these changes have to 
be rather salient in order to become effective distractors. 

In another two-channel study (Schroger, 1996), pairs of tone stimuli (S1 and 
S2) were presented, with subjects instructed to ignore S1 (delivered to the left 
ear) and to make a go/no-go response to the subsequent S2 (delivered to the right 
ear). On most trials, the task-irrelevant S1 was of standard frequency, but 
occasionally its frequency deviated slightly or widely from the standard 
frequency. Deviant tones elicited MMN followed by a small P3a. Distraction of 
the processing of the task-relevant S2 by a deviant S1 was reflected in prolonged 
RTs and attenuated N1 responses to S2 tones that were preceded by a deviant S1 tone. 
These results support a model according to which the auditory system possesses a 
change detection system that monitors the acoustic input and may produce an 
attentional "interrupt" signal when a deviant occurs. However, this involuntary 
capture of attention caused by a deviant sound is rather short-lived, since 
distraction was confined to short S1-S2 intervals (200 milliseconds) and did not 
occur with long ones (560 milliseconds). 

In one-channel distraction studies, even slight irrelevant physical changes in 
task-relevant sounds were found to cause large distraction effects. For example, 
in studies by Schroger and Wolff (1998a, b), subjects were to discriminate the 
duration of equiprobable tones that were of short (200 milliseconds) or long (400 
milliseconds) duration. Tones were of a frequent standard frequency (700 Hz) or 
infrequent deviant frequency (750 Hz), and this frequency variation had no task 
relevance. RTs in the duration-discrimination task were prolonged for deviant- 
frequency tones compared with standard-frequency tones. ERPs to deviant- 
frequency tones showed MMN and P3a to these tones. They also elicited similar 
MMNs in another condition in which attention was directed away from the 
sounds, indicating preattentive registration of the deviant sound. However, these 
small frequency changes elicited no P3a when the sounds were not attended 
(Schroger et al., 2000; Schroger & Wolff, 1998b). It seems likely that when the 
tones were totally task-irrelevant, it was possible to ignore small frequency 
changes in them, as indicated by the lack of P3a to these changes. However, 
when the tones carried task-relevant sound information, task-irrelevant sound 
information could not be easily disregarded, as indicated by P3a to the small 
frequency deviances during the tone-duration discrimination task, suggesting that 
the frequency changes caught the listeners’ attention. 

In the one-channel experiments of Schroger and colleagues, the P3a response 
to frequency changes was usually followed by a fronto-centrally distributed 
negativity at the 400-600 milliseconds range, as shown in Figure 1. However, 
this negativity was confined to conditions in which subjects discriminated long 
sounds from short ones and did not occur when the sounds were ignored or when 
the deviation was made task-relevant. Therefore, it seems likely this negativity is 
associated with reorienting of attention towards the task-relevant aspects of 
stimulation following distraction (Schroger et al., 2000; Schroger & Wolff, 
1998b). For this reason, this negativity was named the "reorienting negativity" 
(RON). SCD maps for RON reveal bilateral frontal sinks around the 
fronto-central FC1 and FC2 electrode sites, suggesting frontal generators and complex 
current fields of lower amplitudes over centro-parietal regions (Schroger et al., 
2000). Interestingly, RON seems not to be modality-specific as it was observed 
even in an analogous visual paradigm (Berti & Schroger, 2001), as shown in 
Figure 3, and even in a cross-modal auditory-visual distraction paradigm 
(Escera et al., 2001; see below). 

Figure 3. Voltage distribution map for the frontally maximal reorienting negativity (RON) 
following an occurrence of deviant-frequency tone. Subjects were to discriminate between 
equiprobable short (200 milliseconds) and long (400 milliseconds) tones that were either of a 
repetitive standard frequency (1000 Hz) or, infrequently, of a deviant frequency (950 or 1050 Hz). 
RON amplitudes were measured at 510-520 milliseconds after tone onset from difference waves 
obtained by subtracting event-related potentials (ERPs) following standard-frequency tones from 
those following deviant-frequency tones (Figure 1). For other details, see Figure 2. Right: The same 
as at the left for RON following a deviant visual stimulus. Subjects were to discriminate between 
equiprobable visual stimuli presented for a short (200 milliseconds) or long (600 milliseconds) 
time. Frequent standard stimuli were green squares containing a gray triangle and infrequent 
deviant stimuli were similar squares, but with a mislocated triangle. RON amplitudes were 
measured at 540-550 milliseconds after figure onset from deviant-standard ERP difference waves 
(after Berti & Schroger, 2001). 

It is important to emphasize that the one-channel distraction task yields 
reliable distraction effects even with very small differences between deviant and 
standard sounds. So far, each subject studied with this paradigm showed a 
behavioral distraction effect (Jaaskelainen et al., 1999; Schroger & Berti, 2000; 
Schroger et al., 2000; Schroger & Wolff, 1998a, 1998b). The behavioral and 
electrophysiological effects observed in this distraction paradigm are highly 
replicable. The product-moment correlations for MMN, P3a, and RON and the 
RT prolongation measured in two separate sessions were between 0.77 and 0.90 
(Schroger et al., 2000). In a very recent experiment (Schroger & Wolff, in 
preparation), the magnitude of frequency change was manipulated in five steps to 
determine the smallest frequency change still yielding behavioral distraction. The 
standard tone was of 1000 Hz and the deviant tones were 0.5%, 1.5%, 2.5%, 
3.5% and 4.5% lower or higher in frequency. The 0.5% deviant did not cause a 
distraction effect. However, with a difference of only 1.5%, the distraction effect 
produced a prolongation of RT that was about 20 milliseconds, with all 12 
subjects showing this effect. The effect was of similar magnitude in the 2.5% and 
3.5% conditions and somewhat larger in the 4.5% condition (32 milliseconds); 
each subject in each of these three conditions showed the RT effect (except in the 
2.5% condition where one subject did not). These results suggest that frequency 
deviation may become perturbing as soon as it is above the discrimination 
threshold. The finding that MMN, P3a, and RON developed together with the 
behavioral distraction effect provides further evidence for the hypothesis that 
these components are indeed functionally related to distraction observed at the 
behavioral level. 
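
For orientation, the five deviance steps described above can be converted into tone frequencies with a few lines of arithmetic. The sketch assumes the percentages are taken relative to the 1000-Hz standard; the function name is hypothetical.

```python
# Minimal sketch: deviant frequencies for percentage steps around a standard tone.
# Assumes each step is a percentage of the standard frequency.

STANDARD_HZ = 1000.0
STEPS_PERCENT = [0.5, 1.5, 2.5, 3.5, 4.5]

def deviant_frequencies(standard_hz, step_percent):
    """Return the (lower, higher) deviant frequencies in Hz for one step."""
    delta_hz = standard_hz * step_percent / 100.0
    return standard_hz - delta_hz, standard_hz + delta_hz

deviants = {step: deviant_frequencies(STANDARD_HZ, step) for step in STEPS_PERCENT}
for step, (lower_hz, higher_hz) in deviants.items():
    print(f"{step}%: {lower_hz} Hz / {higher_hz} Hz")
```

Under this assumption the smallest step that produced distraction in the text (1.5%) corresponds to tones only 15 Hz away from the standard.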



2. EVENT-RELATED BRAIN POTENTIALS TO 
SOUND CHANGES DISTRACTING VISUAL 
TASK PERFORMANCE 

The involvement of the MMN and the P3a generator processes in involuntary 
attention may also be observed cross-modally, as shown by Escera and colleagues 
(1998). In their study, subjects were instructed to discriminate between two 
categories of visual stimuli (odd and even numbers) presented in a random order 
at a constant rate (1 stimulus in 1.2 seconds) on a computer screen, and to press 
the corresponding response button as fast and accurately as possible. A task- 
irrelevant sound occurred 300 milliseconds before each visual stimulus. The 
sound was either a frequently occurring (600 Hz, p=0.8) standard tone, a slightly 
higher deviant tone (700 Hz, p=0.1), or a novel sound (p=0.1) drawn from a pool 
of complex environmental sounds. Figure 4 illustrates that the RTs to visual stimuli 
following deviant tones and novel sounds were about 5 and 20 milliseconds 
longer, respectively, than the RTs to visual stimuli that followed standard 
tones. Unexpectedly, the hit rate was similar after standard tones and novel 
sounds but significantly reduced (by about 2%) after deviant tones, which 
resulted from an increased number of wrong responses to the visual stimuli 
following deviant tones (Figure 4). These effects were replicated in several 
studies using the same or slightly modified versions of this auditory-visual 
distraction paradigm (Alho et al., 1997; Alho et al., 1999; Escera et al., 1998, 
2001, in press; Jaaskelainen et al., 1996; Yago et al., 2001, submitted). 
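
The stimulus schedule of this paradigm can be sketched as follows, using the probabilities (0.8, 0.1, 0.1), the 1.2-second trial period, and the 300-millisecond sound lead given above. The fixed counts per sound type, the unconstrained shuffle, the seed, and the function names are assumptions for illustration; the original studies may have imposed ordering constraints that are not described here.

```python
# Minimal sketch: trial schedule for an auditory-visual distraction paradigm.
import random

def build_sound_sequence(n_trials, p_deviant=0.1, p_novel=0.1, seed=0):
    """Return a shuffled list of sound types with fixed proportions."""
    n_deviant = round(n_trials * p_deviant)
    n_novel = round(n_trials * p_novel)
    n_standard = n_trials - n_deviant - n_novel
    sequence = (["standard"] * n_standard
                + ["deviant"] * n_deviant
                + ["novel"] * n_novel)
    random.Random(seed).shuffle(sequence)  # seeded for reproducibility
    return sequence

def trial_onsets_ms(n_trials, trial_period_ms=1200, sound_lead_ms=300):
    """Return (sound_onset, visual_onset) pairs; the sound leads the visual stimulus."""
    return [(i * trial_period_ms, i * trial_period_ms + sound_lead_ms)
            for i in range(n_trials)]

sequence = build_sound_sequence(100)
onsets = trial_onsets_ms(100)
print(sequence.count("standard"), sequence.count("deviant"), sequence.count("novel"))
```

Fixing the counts rather than sampling each trial independently keeps the realized proportions exactly at the nominal probabilities, which is one common design choice among several.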

Figure 4. Performance speed and accuracy in the auditory-visual distraction paradigm showing 
that reaction times (RTs) are longer to to-be-discriminated visual stimuli following task-irrelevant 
frequency-deviant tones (DEV) and novel sounds (NOV) than to visual stimuli following standard 
tones (STD). Deviant tones are also followed by a hit-rate decrease in the visual performance due 
to an increased number of wrong responses. When the sounds were omitted and the visual stimuli 
were presented alone in a control condition (V), RTs to visual stimuli were slower than for the 
visual stimuli preceded by a standard tone. This outcome suggests that the sounds were not totally 
task-irrelevant but served as timing cues for visual stimuli and therefore speeded up the visual-task 
performance in the auditory-visual condition (after Escera et al., 1998). 

Figure 4 also indicates that the RTs were shorter to the visual stimuli 
preceded by a standard tone than to similar visual stimuli in a control condition, 
in which the sounds were omitted from the stimulus sequence (Escera et al., 
1998). This finding indicates that when the sounds were present, subjects 
covertly monitored the task-irrelevant auditory stimuli and used them as warning 
signals for the occurrence of the visual targets thereby speeding performance. 
Therefore, it may be argued that the observed effects of sound change on visual 
task performance did not truly reflect involuntary attention, as the subjects were 
covertly attending to the sounds. However, similar effects were observed in a 
related experiment by Alho et al. (1997) in which the subject’s attention was 
more effectively directed away from the auditory stimuli and an occurrence of an 
auditory stimulus did not predict the presence of the subsequent task-relevant 
visual stimulus. This situation was achieved by presenting with each sound 
(either a standard or deviant tone) a simultaneous visual warning cue informing 
the subject of whether a successive task-relevant stimulus would be delivered or 
not. As in the Escera et al. (1998) study, prolonged RTs and increased error rates 
to visual stimuli following task-irrelevant deviant tones were observed, thereby 
confirming the involuntary nature of the attention capture by the deviant tones. 
Moreover, it was found that deviant tones preceding the visual target stimuli 
caused an attenuation of the occipital N1 in the ERPs to the visual targets. As 
cued visual stimuli elicit enhanced N1 responses due to enhanced attention to 
these stimuli (Mangun & Hillyard, 1991), the attenuation of the N1 to the cued 
visual targets caused by a preceding deviant tone indicates that the involuntary 
switching of the subject's attention to the deviant tone interfered with the early, 
attentive visual processing (Alho et al., 1997). 

ERPs recorded by Escera et al. (1998) showed that a deviant tone in the 
auditory-visual stimulus pair elicited MMN followed by a small P3a (see Figure 
1). These ERP results, in combination with the behavioral results discussed 
above, suggest that two different attention-switching mechanisms are involved in 
the behavioral distraction observed in the auditory-visual distraction paradigm. 
The N1 enhancement to novel sounds, in comparison with the N1 to standard 
tones, was probably caused by an enhancement of the supratemporal N1 
component (Näätänen & Picton, 1987) or some other N1 component related to 
attention, such as the frontal N1 described by Giard et al. (1994). However, 
MMN also appeared to contribute to the enhanced N1 deflection to the novel 
sounds occurring among standard tones, as recently demonstrated by Alho et al. 
(1998). Consequently, the switching of attention to novel sounds was probably 
triggered by a combined response of the transient-detector mechanism reflected 
by N1 and the stimulus-change detector mechanism reflected by MMN (cf. 
Näätänen, 1990, 1992; Näätänen & Picton, 1987). This process resulted in 
effective orienting of attention towards the novel sounds, as indicated by the 
subsequent large P3a wave illustrated in Figure 1 and the markedly delayed RT 
to the following visual stimulus illustrated in Figure 4. For deviant tones, the 
MMN and the subsequent small P3a suggest an attention switch initiated by the 
stimulus-change detector mechanism generating the MMN (Schröger, 1996). 
Apparently, this mechanism is less effective in triggering attention switches, as 
indicated by the smaller P3a, and the more modest effect on the visual RT 
compared with that caused by the novel sounds. 

In the study of Escera et al. (1998), the P3a response to novel sounds had two 
distinct subcomponents. Figure 5 illustrates that the early portion of P3a showed 
a centrally dominant scalp distribution with a polarity reversal at posterior and 
inferior lateral electrode sites, whereas the late portion of P3a displayed a frontal 
scalp maximum. The scalp distributions of these two P3a phases are in agreement 
with studies indicating multiple generator sources for P3a. The earlier, centrally 
maximal portion of P3a is probably dominated by contributions from posterior 
sources in the temporal and parietal cortices, indicated by effects of local brain 
lesions on ERPs (Knight et al., 1989), by intracranial ERP recordings (Halgren et 
al., 1995a, 1995b), and by ERP and MEG source modeling (Alho et al., 1998; 
Mecklinger & Ullsperger, 1995). Additionally, the later frontally maximal 
portion of P3a is presumably dominated by activity generated in anterior, 
prefrontal sources indicated by lesion studies (Knight, 1984), intracranial 
recordings (Baudena et al., 1995), and source modeling (Mecklinger & 
Ullsperger, 1995). It is likely that the two phases of P3a reflect two different 
processes in the course of involuntary attention switching, as also suggested by 
their different sensitivity to attentional manipulations. For example, Escera et al. 
(1998) found that the early P3a to novel sounds was of similar amplitude when 
their subjects concentrated on reading a book and when they performed the visual 
task in the distraction paradigm with auditory-visual stimulus pairs. The late P3a 
was enhanced in amplitude in the latter condition, in which the task-irrelevant 
auditory stimuli were to some extent covertly attended, as discussed above. The 
neural generators of the early phase of P3a and its apparent insensitivity to 
attentional manipulation suggest the early P3a partly reflects further processing 
of stimulus changes in the auditory cortex (Alho et al., 1998) and some violation 
of a multimodal model of the external world maintained in the temporal-parietal 
association cortex (Yamaguchi & Knight, 1991). The late P3a, in turn, may be 
more closely related to the actual orienting of attention, as suggested by its 
dependence on attention and by its prefrontal origin. 



[Figure 5 panels: early P3a (230 ms), late P3a (330 ms)]




Figure 5. Voltage distribution maps for the P3a component elicited by infrequent, complex 
novel sounds occurring in a sequence of repetitive standard tones and infrequent, slightly higher 
deviant tones presented to a subject concentrating on a visual task. P3a amplitudes were measured 
at 230 and 330 milliseconds after sound onset from difference waves obtained by subtracting the 
event-related potentials (ERP) to standard tones from those to deviant tones. Lighter shades of gray 
indicate more positive voltages. As seen from these maps, the later phase of P3a is more frontally 
distributed than the earlier phase. For other details, see Figure 1 (after Yago et al., submitted). 

The ERP data of Escera et al. (1998) also indicate that the P3a from novel 
sounds was followed by a RON response, as shown in Figure 1, which appears to 
be associated with reorienting of attention back to the current task after a 
distracting event (Berti & Schröger, 2001; Schröger & Berti, 2000; Schröger et 
al., 2000; Schröger & Wolff, 1998a, b). Escera et al. (2001) recently provided 
evidence confirming the association of RON with reorienting of attention. They 
used a paradigm similar to that of Escera et al. (1998), except that a 
task-irrelevant auditory stimulus (a standard tone, deviant tone, or 
novel sound) occurred, in different conditions, either 245 or 355 
milliseconds before each visual target stimulus. It was found that a negativity, 
presumably RON, following P3a responses to the deviant tones and novel sounds 
peaked at 350 milliseconds from visual-target onset in the two conditions of 
different auditory-visual stimulus onset asynchrony. Thus, RON was evidently 
related to the processing of visual target stimuli rather than to the processing of 
preceding task-irrelevant sounds. 



3. CONCLUSION 

It has long been assumed that the brain activity reflected by MMN and P3a in 
ERPs to deviant sounds among repetitive standard sounds is associated with 
involuntary orienting of attention to auditory stimulus changes (Näätänen & 
Michie, 1979; Squires et al., 1975). The recent studies reviewed above, 
measuring ERPs to infrequent changes in auditory stimuli and the distracting 
effects of these changes on auditory and visual task performance, have provided 
support for these early assumptions. These studies have also shown that involuntary 
attention to widely deviant auditory stimuli, e.g., novel sounds among tones, may 
be partly triggered by an enhanced N1 response to these sounds. Although MMN 
and N1 appear to indicate preattentive detection of a deviant or novel auditory 
event and initiation of an involuntary attention switch to such events, P3a is 
presumably associated with the actual consequent attention switching. Moreover, 
recent studies also suggest that reorienting of attention back to a task after a 
distracting stimulus change is reflected by RON, an ERP component following 
P3a after a distracting event. 

Converging evidence from ERP and MEG studies using source-localization 
methods, as well as SCD analysis, intracranial ERP recordings, and studies 
examining effects of local brain lesions on ERPs, in conjunction with studies 
applying PET, fMRI, and EROS methods, indicates involvement of auditory and 
prefrontal cortices in the preattentive detection of auditory stimulus changes 
generating MMN and initiating an involuntary attention switch to these changes. 
According to MEG recordings, the enhancement of the N1 response to widely 
deviant sounds would also be explained by enhanced auditory-cortex activity, 
which might have a role, together with MMN, in initiation of involuntary 
attention to such sounds. Source localization of P3a, as well as intracranial P3a 
recordings and effects of brain lesions on P3a, in turn, suggests that a complex 
cerebral network, including areas in the prefrontal, temporo-parietal, and auditory 
cortices, as well as in the parahippocampal and anterior cingulate gyri and 
hippocampus, is activated during the actual attention switching to a deviant or 
novel auditory event. Finally, SCD analysis of RON implies that prefrontal areas 
are also involved in directing attention back to the task after a stimulus change 
disrupts task performance. 

In addition to clarifying brain functions involved in involuntary attention to 
auditory stimuli, the experimental distraction paradigms described above have 
proven to be useful in studies on acute effects of alcohol on attention. 
Jääskeläinen et al. (1999) used the distraction paradigm developed by Schröger 
and Wolff (1998a) and found that both the RT prolongation and the P3a response 
caused by occasional task-irrelevant frequency changes in tones during a tone- 
duration discrimination task were reduced by a moderate dose of ethanol leading 
to about 0.04% blood alcohol concentration (BAC) relative to a placebo 
condition. Furthermore, Jääskeläinen et al. (1996) used the distraction paradigm 
developed by Escera et al. (1998) and found that the hit-rate reduction observed 
for visual target stimuli preceded by a deviant tone, as seen in Figure 4, was 
significantly smaller during mild ethanol intoxication (0.05% BAC) than in a 
placebo condition. Thus, even very small doses of ethanol that are generally 
regarded as not markedly impairing sensory-motor performance (e.g., driving 
a motor vehicle) significantly suppress the attention-capturing effects of changes 
in the auditory environment. 

The distraction paradigms described above also have been used to determine 
impairment of attention in clinical groups. Ahveninen et al. (2000), using the 
paradigm developed by Schröger and Wolff (1998a), found that the RT 
prolongation in a tone-duration discrimination task caused by task-irrelevant 
changes in tone frequency was significantly larger in chronic alcoholics than in 
the control subjects. This RT effect correlated positively with the MMN 
amplitude (r=0.7). Hence, the abnormal attentional reactivity to irrelevant sound 
changes observed in the alcoholics appeared to be caused by over-active 
preattentive change detection generating MMN. Polo et al. (1998, 1999) also 
applied the auditory-visual distraction paradigm developed by Escera et al. 
(1998) and observed enhanced P3a responses to both frequency-deviant tones 
and novel sounds in chronic alcoholics relative to control subjects, indicating 
enhanced involuntary orienting to auditory stimulus changes in the alcoholic 
patients. 

Finally, Kaipio et al. (in preparation) used the paradigm of Escera et al. 
(1998) to study closed-head injury patients with signs of distractibility during 
neuropsychological assessment. In these patients, the small auditory stimulus 
changes of the deviant tones were less effective in capturing attention compared 
to the healthy control subjects, as indicated by the absence of RT effects and 
RON after the deviant tones in the patients, whereas the controls showed similar 
RT distraction effects and RON after the deviant tones as in the study of Escera 
et al. (1998). The preattentive auditory deviance-detection system of the patients, 
however, worked quite normally, as indicated by approximately similar MMN 
and P3a responses in the patients and controls. The absence of RT effects and 
RON after deviant tones in the patients suggests that the patients were able to pay 
less attention to the auditory-visual stimulus relation than the controls. However, 
large auditory stimulus changes (novel sounds) were associated with an RT 
distraction effect and RON, indicating that the attention-catching novel sounds 
caused a normal involuntary attention switch. 

In conclusion, in addition to revealing the sequence of short-lived brain 
activations involved in involuntary attention and leading to distracted task 
performance, simultaneous measurements of ERPs and behavioral distraction 
caused by deviant sounds may provide valuable information on effects of ethanol 
and brain injuries on attention. Thus, combining behavioral and ERP measures 
appears to be a useful approach in studies on function and dysfunction of brain 
mechanisms involved in involuntary orienting of attention to changes in the 
auditory environment and in reorienting attention back to the distracted task. 



1. INTRODUCTION 

When a deviant tone is presented within a sequence of repetitive 
("standard") tones, it evokes a specific response in the event-related potential 
(ERP) called mismatch negativity (MMN). This response lasts from about 
100 to 250 milliseconds after stimulus onset, is maximal over frontal/central 
scalp areas, and is thought to originate from temporal and frontal cortices 
(for reviews, see Näätänen, 1990, 1992, 1995). Näätänen defines the MMN 
as "the brain's automatic response to changes in repetitive auditory input" 
(1990, p. 201). Within the auditory modality, MMN has been observed to 
changes in tonal frequency, intensity, duration, spatial location, and many 
other auditory stimulus parameters. An important part of its definition is the 
automaticity of the response — its independence of attention. Indeed, it has 
been shown many times that the auditory MMN is unaffected to a large 
degree by the difficulty (or load) of a concurrent task in the visual modality 
(Alho et al., 1992; Sams et al., 1985; Ritter et al., 1995). Note that this 
independence of attention does not imply that the auditory process 
underlying MMN occurs without any attention. An alternative possibility is 
that the auditory process has its own, modality-specific attentional resource 
(Wickens, 1984), in which case MMN will be unaffected by the difficulty of 
a task in a different modality. Indeed, effective withdrawal of auditory 
attention by a concurrent task in the auditory modality has been shown to 
reduce the MMN (Alain & Woods, 1997; Woldorff et al., 1991, 1998). 

It is still unclear whether a comparable MMN component can be obtained 
in the visual modality. Nyman et al. (1990) presented sine wave gratings at a 
rate of one in 490 milliseconds to the central (2°x2°) visual field. The 
standard gratings (90%) had a high spatial contrast of 0.72, whereas the 
deviant gratings (10%) had a lower spatial contrast of 0.24. Between 200 and 
300 milliseconds after stimulus onset, ERPs elicited by the low contrast 
deviant stimuli were negatively displaced with respect to ERPs elicited by 
high contrast standard stimuli. However, when the mapping of contrasts on 
stimulus probabilities was reversed, the negative displacement occurred to 
all low contrast stimuli irrespective of their probability. Hence, the negative 
displacement was not due to the deviance of the contrast but rather to the 
low contrast as such — that is, it was an exogenous effect. However, the 
negative displacement in the control experiment occurred between 100 and 
200 milliseconds, which leaves the 200-300 millisecond effect in the main 
experiment unexplained. 

Czigler and Csibra (1990) presented a small (76'x46') black rectangle 
once in 417 milliseconds to the central visual field. On 10% of the trials, the 
rectangle was slightly thicker than on the remaining 90% of the trials. The 
subjects did not notice the difference between standard and deviant 
thickness, and the ERP waveforms did not differ from each other. After the 
subjects had been notified of the difference, a small negative response (210- 
240 milliseconds) to deviant stimuli was observed in a second block. In 
addition, a deviant orientation of two easily visible angles inside the 
rectangle always led to negative deflections between 90 and 180 
milliseconds and between 210 and 270 milliseconds. Whereas the responses 
after 200 milliseconds likely originated from attention-related processes 
(Harter & Guido, 1980), the earlier 90-180 millisecond negativity might be a 
visual analogue of the auditory MMN (see also Czigler & Csibra, 1992). No 
control for possible exogenous effects of line orientation was imposed, so 
that the observed early difference might also be due to the physical 
difference between standard and deviant stimuli (Harter et al., 1980). 

Woods et al. (1992) presented auditory and visual stimuli at a mean rate 
of one in 300 milliseconds. The visual stimuli were vertical gratings of either 
0.7 or 2.0 cycles per degree (cpd) and were flashed either to the left or right 
of fixation (at 4.7° eccentricity). The standard stimuli (90%) were slightly 
taller than wide (3.9°x4.4°), the remaining 10% were shorter (about 
3.9°x3.6°). In different blocks, either visual or auditory deviants required a 
response. A negative deflection was observed to stimuli with the deviant size 
(or shape) at contralateral posterior scalp sites from about 160 to 360 
milliseconds. This negativity occurred in both the auditory and visual 
attention conditions, but was 3.3 times larger during visual (intramodal) 
attention. Although this response may be a visual MMN, it may also 
exogenously reflect the constant difference in size between standard and 
deviant stimuli (Harter et al., 1976). 

In a companion paper, Alho et al. (1992) flashed the 2.0 cycles per 
degree (cpd) gratings to the central visual field: standard stimuli (80%) were 
again taller than wide (3.9°x4.4°), 10% were slightly shorter (3.9°x3.9°) 
small deviants, and 10% were considerably shorter (3.9°x2.4°) large 
deviants. In different blocks, either small or large deviants in either the 
auditory or visual modality required a response. The small visual deviants 
elicited a negativity only when they were targets and only after 270 
milliseconds. The large visual deviants elicited negativities at 120, 200, and 
270 milliseconds, which were enhanced when the visual stimuli were 
attended. The first response at 120 milliseconds, however, was unaffected by 
intermodal attention, which may point to an automatic processing of the 
deviant visual feature. Again, however, the deviant stimuli always had a 
different size than the standard stimuli, so that these effects might be due to 
the different stimulus parameters. A control experiment presented the large 
deviants alone (i.e., without intervening standards), and found the same 
negative responses as when the deviants were embedded within standards 
(Alho et al., 1992). This result strongly suggests an exogenous cause of the 
observed effect. 

Tales et al. (1999) presented either one thick bar (2.2°x0.68°) or two 
thinner bars (2.2°x0.34°) every 612-642 milliseconds to the upper and lower 
part of a computer screen. Subjects' attention was directed to the center of 
the screen by asking them to detect an occasional change of a color patch. In 
experiment 1, the thick bars were presented frequently (88.9%) and the 
thinner bars infrequently (5.6%); in experiment 2, this mapping was 
reversed. They found a late and long-lasting negativity (250-400 
milliseconds) at all posterior electrodes to deviant stimuli that was 
independent of which stimulus served as deviant. There was no control over 
the attention of the subjects, other than that they had to detect an occasional 
(5.6%) change of a color patch at the center of the screen. Moreover, the 
long latency and broad scalp distribution of the effect closely resemble the 
classical visual selection negativity (Harter & Guido, 1980; for a review, see 
Heslenfeld et al., 1997). Thus, this result more likely reflects attentive 
processing of deviant stimuli than an early, automatic, visual 
mismatch negativity (cf. Czigler & Csibra, 1990). 

In sum, previous studies of visual deviance-related processes either have 
observed no effect at all, or have failed to control for confounding 
exogenous stimulus parameters. In addition, the stimulus deviance 
dimension in previous studies was always relevant to the task during some 
blocks (i.e., during some visual attention conditions), so that it cannot be 
excluded that stimuli were differentially processed in this respect even 
during auditory attention conditions. In the present experiment, exogenous 
factors are controlled by comparing ERPs to physically identical stimuli that 
were either standards or deviants in different blocks. None of these stimuli 
were ever task-relevant, excluding any residual task-related processing by 
deviant stimuli. Finally, three levels of difficulty of a concurrent task within 
the same visual modality were employed to assess the automaticity of 
possible deviance-related effects. 
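
The logic of the physical-identity control can be illustrated with a short numerical sketch. It is not the authors' analysis code: the function name `deviance_effect`, the dictionary layout, and the toy waveforms are assumptions made purely for illustration. Each grating's ERP as a deviant is compared with the ERP to the same grating as a standard (from a different block), so any purely exogenous difference between the two gratings cancels.

```python
import numpy as np

def deviance_effect(erp):
    """Average deviant-minus-standard difference for physically identical stimuli.

    `erp` maps (spatial_frequency, role) to a 1-D averaged waveform, where
    role is "standard" or "deviant" (hypothetical layout for this sketch).
    """
    diffs = [
        np.asarray(erp[(sf, "deviant")], dtype=float)
        - np.asarray(erp[(sf, "standard")], dtype=float)
        for sf in ("high", "low")
    ]
    return np.mean(diffs, axis=0)

# Toy data: each grating has its own exogenous waveform, and deviants carry
# an additional negativity of -1 at the third sample.
exogenous = {
    "high": np.array([0.0, -3.0, 1.0, 0.0]),
    "low": np.array([0.0, -1.0, 2.0, 0.0]),
}
extra = np.array([0.0, 0.0, -1.0, 0.0])
erp = {}
for sf in ("high", "low"):
    erp[(sf, "standard")] = exogenous[sf]
    erp[(sf, "deviant")] = exogenous[sf] + extra

effect = deviance_effect(erp)
```

In this toy case the exogenous waveforms differ strongly between gratings, yet only the added deviance-related negativity survives the subtraction.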



2. METHODS 



2.1 Subjects 

Fourteen subjects (age range 18-24, mean age 20.9 years, 10 females, 12 
right-handed) participated for course credits. All had normal or corrected-to- 
normal vision; none reported any history of neurological or psychiatric 
disease. 



2.2 Stimuli, Task, and Procedure 

The only task of the subjects was a compensatory visuo-motor tracking 
task. A small bright rectangle was continuously visible at the center of the 
computer screen, which moved constantly and unpredictably either to the 
left or right. The task of the subject was to keep the rectangle in the middle 
of the screen by means of compensatory button presses with the left and 
right index finger. The speed of rectangle movement and the frequency of its 
spontaneous changes of direction were linearly adjusted in order to create 
three levels of task difficulty. Relative to the "easy" condition, the speed and 
frequency of direction changes were doubled in the "moderate" condition, 
and tripled in the "difficult" condition. A total of 24 blocks each lasting 3.5 
minutes were presented to each subject, whose only task during the entire 
experiment was to keep the moving rectangle in the middle of the screen. 

Two types of task-irrelevant probe stimuli were presented: (a) Every 4-14 
seconds, the screen became blank for 33.3 milliseconds, which appeared as a 
brief white flash. The data for these flashes will not be discussed. (b) In 
addition, white-on-black vertical square wave gratings were presented 
simultaneously to the upper and lower 5.6° of the computer screen in 12 of 
the 24 blocks. A central horizontal bar 45' in height was not stimulated to 
avoid interference with the horizontally moving rectangle. This area was 
constantly demarcated by two thin (1.5') white horizontal lines. Two small 
rectangles (4.5'x7.5') were attached to these lines to continuously indicate 
the mid-screen target position. 

The gratings had either a high (2.3 cpd) or low (0.58 cpd) fundamental 
spatial frequency, and they were presented for 16.7 milliseconds once in 
350-450 milliseconds (rectangular distribution), with a contrast of about 
20%. In 6 of the 12 blocks, the low spatial frequency grating was presented 
in 80% of the trials, and the high spatial frequency grating in the remaining 
20%. In the other 6 blocks, the presentation rates were reversed. A total of 
400 trials with the standard spatial frequency and 100 trials with the deviant 
spatial frequency were randomly intermixed in each block, with the 
exception that a deviant grating was always preceded by a standard grating. 
Task difficulty was constant for four blocks in a row, after which it was 
changed to another level. Two of these four blocks contained only flashes, 
the other two contained flashes and gratings. Blocks with only flashes and 
blocks with flashes and gratings were alternated. After 12 blocks (4 blocks at 
each difficulty level), the entire sequence of events was replicated with 
another 12 blocks. The sequences of task difficulties, whether or not there 
were gratings in a block, and which grating was the standard, were pseudo- 
randomized and counterbalanced over subjects. Subjects sat in a silent, 
dimly illuminated room, and were trained on all task difficulties before the 
experiment began. 
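
One way to build a block that satisfies the stated constraints (400 standards, 100 deviants, every deviant preceded by a standard, onset intervals drawn from a rectangular 350-450 millisecond distribution) is sketched below. The randomization procedure actually used is not described in the text; the function `make_block` and its parameters are hypothetical, and any quantization of intervals to the display refresh is ignored here.

```python
import random

def make_block(n_standard=400, n_deviant=100, soa_ms=(350.0, 450.0), seed=None):
    """Return (trials, soas) for one block of task-irrelevant gratings.

    Each deviant is attached to a distinct, randomly chosen standard and
    placed directly after it, so a deviant never opens the block and two
    deviants never occur in a row.
    """
    rng = random.Random(seed)
    followed = set(rng.sample(range(n_standard), n_deviant))
    trials = []
    for i in range(n_standard):
        trials.append("standard")
        if i in followed:
            trials.append("deviant")
    # Onset-to-onset intervals from a rectangular (uniform) distribution.
    soas = [rng.uniform(*soa_ms) for _ in trials]
    return trials, soas

trials, soas = make_block(seed=1)
```

Choosing which standards are followed by a deviant, rather than shuffling and rejecting invalid orders, guarantees the constraint by construction.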

2.3 Recordings 

Electroencephalographic (EEG) data were recorded from eight tin 
electrodes mounted in an elastic cap (at locations Fz, Cz, Pz, Oz, C3, C4, T5, 
T6), and one electrode on the left mastoid, all referenced to an electrode on 
the right mastoid, with impedances kept below 5 kΩ. Vertical 
electro-oculogram (EOG) activity was recorded bipolarly from above and below the 
right eye, and horizontal EOG from the outer canthi of each eye. Data 
acquisition was continuous, with a sampling rate of 250 Hz and band pass 
filtering at 0.08-35 Hz. The left and right button presses (binary responses) 
and the position of the rectangle on the screen (in pixels) were also recorded. 

2.4 Data analysis 



2.4.1 Performance 

The root mean square of the lateral position of the moving rectangle was 
computed to estimate the performance during each block. That is, deviations 
of the rectangle from the mid-screen target position were squared and 
averaged over time, separately for each block. The square roots of these 
means were averaged across replications. The performance data were 
analyzed by a Task Load (easy, moderate, difficult) x Standard Spatial 
Frequency (high, low) multivariate analysis of variance (MANOVA). 
Hotelling's T² test was used to assess effects involving Task Load. 
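
The performance measure described above can be written out as a minimal sketch. The function name `rms_error` and the toy position samples are assumptions for illustration; the conversion of 40 pixels per degree of visual angle is the one reported with the performance data.

```python
import numpy as np

PIXELS_PER_DEGREE = 40.0  # 40 pixels correspond to 1 degree of visual angle

def rms_error(positions_px, target_px=0.0):
    """Root mean square deviation of the rectangle from the target position."""
    deviations = np.asarray(positions_px, dtype=float) - target_px
    return float(np.sqrt(np.mean(deviations ** 2)))

# Toy lateral positions (pixels relative to mid-screen) for two replications
# of the same condition.
block_first = rms_error([3, -3, 3, -3])
block_second = rms_error([-5, 5, 5, -5])

# The square roots are averaged across replications, as in the text.
mean_rms_px = (block_first + block_second) / 2.0
mean_rms_deg = mean_rms_px / PIXELS_PER_DEGREE
```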

2.4.2 Evoked Potentials 

Time series of 512 milliseconds duration were extracted off-line from the 
continuous data for each of the 14 channels, time-locked to the onset of a 
grating. The first 60 milliseconds prior to the onset of the grating defined the 
pre-stimulus baseline. All EEG channels were re-referenced to the algebraic 
mean amplitudes of the two mastoids. To reduce the low frequency drifts 
that might occur if subjects followed the moving rectangle with their eyes, 
linear trends were fitted and removed from all raw EEG and EOG time 
series (Porges & Bohrer, 1990). EOG artifacts were removed from the EEG 
by the method of Woestenburg et al. (1983). Trials containing amplifier 
saturations, peak-to-peak amplitude differences larger than 120 µV, or 
sample-to-sample amplitude differences larger than 25 µV, were discarded. 
In addition, the first 10 trials of each block were excluded from the averages. 
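
The detrending, baseline correction, and amplitude-based rejection steps can be sketched for a single channel as follows. This is an illustration under stated assumptions, not the original pipeline: the order of operations, the function name `preprocess_epoch`, and the toy epochs are hypothetical, and re-referencing, EOG correction, and saturation checks are omitted. The thresholds, the 250 Hz sampling rate, and the 60 millisecond baseline come from the text.

```python
import numpy as np

FS_HZ = 250                    # sampling rate reported in the Recordings section
N_BASELINE = 15                # 60 ms pre-stimulus baseline at 250 Hz
MAX_PEAK_TO_PEAK_UV = 120.0    # rejection threshold from the text
MAX_STEP_UV = 25.0             # sample-to-sample threshold from the text

def preprocess_epoch(epoch_uv):
    """Detrend and baseline-correct one epoch; return (epoch, keep flag)."""
    x = np.asarray(epoch_uv, dtype=float)
    t = np.arange(x.size)
    # Fit and remove a linear trend (assumed here to be a least-squares line).
    slope, intercept = np.polyfit(t, x, 1)
    x = x - (slope * t + intercept)
    # Subtract the mean of the pre-stimulus samples.
    x = x - x[:N_BASELINE].mean()
    peak_to_peak = x.max() - x.min()
    max_step = np.abs(np.diff(x)).max()
    keep = peak_to_peak <= MAX_PEAK_TO_PEAK_UV and max_step <= MAX_STEP_UV
    return x, bool(keep)

t = np.arange(128)                              # 512 ms at 250 Hz
drift_epoch = 2.0 * t + 5.0                     # slow linear drift only
step_epoch = np.where(t < 64, 0.0, 40.0)        # abrupt 40 µV jump

clean, keep_drift = preprocess_epoch(drift_epoch)
_, keep_step = preprocess_epoch(step_epoch)
```

The drifting epoch spans more than 120 µV before detrending but is retained once the trend is removed, whereas the abrupt jump still exceeds the sample-to-sample criterion and is rejected.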

Trials were averaged according to task difficulty, spatial frequency, and 
stimulus probability, and then were averaged across replications. For each 
EEG channel, 18 mean amplitude measures of 20 milliseconds duration 
(from 40 to 400 milliseconds post-stimulus) were computed, which reflected 
the mean of the recorded voltage in each condition for each scalp site and 
time interval. These data were analyzed by MANOVAs, with the factors 
Task Load (easy, moderate, difficult), Spatial Frequency (high, low), 
Deviance (standard, deviant), and Channels (3 levels). The analyzed 
channels were Fz, Cz, Pz (midline), and T5, Oz, T6 (posterior row, in a 
separate analysis). If necessary, scalp distributions were normalized such 
that their multivariate vector lengths equaled one (McCarthy & Wood, 
1985). Hotelling's T² was used for all tests involving factors with more than 
2 levels. Because of the large number of tests, the critical α level was set to 
0.01. Since almost all interesting effects occurred at Oz between 60 and 100 
milliseconds, and at Fz and Oz between 120 and 180 milliseconds, separate 
Task Load x Spatial Frequency x Deviance analyses were performed for 
these mean amplitude measures. The critical α level for these additional 
analyses was 0.05. 
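
The 18 mean amplitude measures and the vector-length normalization of scalp distributions can be sketched as below. The function names `window_means` and `vector_scale`, the half-open treatment of window boundaries, and the toy inputs are assumptions for illustration; the window width, latency range, sampling rate, and baseline length are taken from the text.

```python
import numpy as np

FS_HZ = 250                 # sampling rate
PRE_STIM_MS = 60            # pre-stimulus baseline contained in each epoch
WINDOW_MS = 20              # width of each mean amplitude window
FIRST_MS, LAST_MS = 40, 400 # analyzed latency range

def window_means(epoch_uv):
    """Mean amplitude in consecutive 20 ms windows from 40 to 400 ms."""
    x = np.asarray(epoch_uv, dtype=float)
    ms_per_sample = 1000 // FS_HZ                       # 4 ms per sample
    times = np.arange(x.size) * ms_per_sample - PRE_STIM_MS
    starts = range(FIRST_MS, LAST_MS, WINDOW_MS)        # 40, 60, ..., 380
    # Assumption: each window includes its start and excludes its end.
    return np.array(
        [x[(times >= s) & (times < s + WINDOW_MS)].mean() for s in starts]
    )

def vector_scale(amplitudes):
    """Scale amplitudes across channels so that the vector length equals one."""
    a = np.asarray(amplitudes, dtype=float)
    return a / np.linalg.norm(a)

ramp = np.arange(128) * 4.0 - 60.0   # toy epoch whose value equals its latency in ms
means = window_means(ramp)
scaled = vector_scale([3.0, 4.0, 0.0])
```

With a 4 millisecond sampling interval, each window averages five samples, which is why the first window of the toy ramp has a mean of 48 rather than 50.
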



3. RESULTS 



3.1 Performance 

Figure 1 displays the means and standard errors (across 14 subjects) of 
the performance data in the three task load conditions, separately for blocks 
in which high or low spatial frequency gratings were standards. Data are 
shown in pixels; 40 pixels correspond to 1° of visual angle. There was a 
main effect of Task Load (F(2,12)=919.72, p<0.001), and a main effect of 
Spatial Frequency (F(1,13)=7.49, p<0.017), but no interaction. A separate 
2x2 analysis involving the "easy" and "moderate" conditions showed only a 
main effect of Task Load (F(1,13)=1412.03, p<0.001), but no effects of 
Spatial Frequency. A 2x2 analysis involving the "moderate" and "difficult" 
conditions revealed main effects of Task Load (F(1,13)=184.33, p<0.001) 
and Spatial Frequency (F(1,13)=5.46, p<0.037), but no interaction. Thus, 
performance in both the "easy" and the "difficult" conditions differed from 
the "moderate" condition, and in the more difficult conditions low spatial 
frequency standards interfered more with the task than high spatial 
frequency standards. 







Figure 1. Performance data for each level of task difficulty for high and low standard spatial 
frequencies (task-irrelevant probes). Shown are means and standard errors of the root mean 
squared deviations of the moving rectangle from the target mid-screen position (n=14). Data 
are given in pixels; 40 pixels correspond to 1° of visual angle. 



3.2 Evoked Potentials 

Figure 2 displays the grand average (over 14 subjects) evoked potentials 
at eight electrodes to each grating at each probability level in the "moderate" 
task load condition. There was a large negative response to the high spatial 
frequency gratings at Oz between 50 and 100 milliseconds, which seemed to 
be larger for deviant than standard high spatial frequency gratings and 
appears to correspond to the early exogenous C1 component (Smith & 
Jeffreys, 1978). This response was followed at Oz by both spatial frequency 
and deviance effects from about 100 to 200 milliseconds. In addition, there 
are deviance effects at Fz and Cz in the same latency range, which seem 
larger for the low than the high spatial frequency gratings. 



[Figure 2 legend: standard high, standard low, deviant high, and deviant low spatial frequency traces; amplitude scale 4 µV; moderate task load; electrode label Fz]





Figure 2. Grand averaged evoked potentials at 8 scalp sites to low and high, standard and 
deviant spatial frequencies from the intermediate level of task difficulty (n=14). 

Figure 3. Grand average differences between responses evoked by high and low spatial 
frequencies at three levels of task difficulty (n=14). 

Figure 3 shows the differences between responses to high and low spatial 
frequencies, pooled over stimulus probabilities, separately for each task load 
condition. Starting with the posterior channels (T5, Oz, T6), there was an 
interaction between Spatial Frequency and Channels from 60 to 100 
milliseconds (smallest F(2,12)=22.93, p<0.001), accompanied by a main 
effect of Spatial Frequency from 80 to 100 milliseconds (F(1,13)=24.88, 
p<0.001), reflecting a larger negative response to high spatial frequency 
gratings, which was larger at Oz than at T5 or T6. There was a second 
interaction between Spatial Frequency and Channels, plus a Spatial 
Frequency main effect, between 160 and 200 milliseconds, reflecting a 
larger positive response to high spatial frequency gratings, which was again 
larger at Oz than at T5 and T6 (smallest F(2,12)=11.06, p<0.002, and 
smallest F(1,13)=20.21, p<0.001). There was no interaction between Spatial 
Frequency and Task Load, or Spatial Frequency, Task Load, and Channels. 
In other words, these exogenous effects were independent of task load, and 
thus replicable across blocks. 

At the midline leads (Fz, Cz, Pz), there was a Spatial Frequency x 
Channels interaction from 60 to 120 milliseconds (smallest F(2,12)=10.15, 
p<0.003), which reflects the scalp distribution of the spatial frequency effect 
(posteriorly negative, anteriorly positive). The interaction was followed by a 
main effect of Spatial Frequency from 140 to 160 milliseconds (a negativity, 
F(1,13)=13.09, p<0.004), which was followed by another interaction with 
Channels from 180 to 200 milliseconds (posteriorly positive, anteriorly 
negative, F(2,12)=7.14, p<0.01). There was no interaction between Spatial 
Frequency and Task Load, or Spatial Frequency, Task Load, and Channels, 
again stressing the robustness of these exogenous effects to both task load 
and replication. 

Figure 4 displays the differences between the responses to standard and 
deviant gratings, pooled over spatial frequencies, separately for each task 
load condition. At the posterior channels, there was a Deviance x Channels 
interaction from 60 to 160 milliseconds (smallest F(2,12)=7.20, p<0.009), 
which was overlapped and followed by a Deviance main effect from 120 to 
200 milliseconds (smallest F(1,13)=10.61, p<0.007). This sequence of 
effects seemed due to the fact that (Figure 4) the deviance effect was 
negative at Oz and slightly positive at T5/T6 from 60 to 120 milliseconds, 
negative and larger at Oz than at T5/T6 from 120 to 160 milliseconds, and 
negative at all posterior electrodes from 160 to 200 milliseconds. There was 
no significant interaction between Deviance and Task Load, or Deviance, 
Task Load, and Channels, showing that these effects were independent of 
the difficulty of the visual tracking task. In addition, there was no interaction 
between Deviance and Spatial Frequency, but there was a significant 
Deviance x Spatial Frequency x Channels interaction between 80 and 100 
milliseconds (F(2,12)=17.46, p<0.001), indicating that the earliest part of the 
deviance effect was a modulation of the scalp topographies of the C1 
responses to the two spatial frequencies. Indeed, this effect vanished after 
normalizing the scalp distributions (F(2,12)=0.63, p<0.55), so that it could 
be attributed to an amplitude modulation of the C1 response in particular to 
high spatial frequency gratings by stimulus deviance (also see Figure 5). 
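
The logic of the normalization check can be illustrated with a small sketch. The text does not state which normalization was applied, so the Python example below assumes vector scaling (each condition's amplitudes are divided by their root-sum-of-squares across electrodes), which is one common choice. The amplitudes for T5, Oz, and T6 are made-up values, not data from the study.

```python
import numpy as np


def vector_scale(amplitudes):
    """Scale each condition's topography to unit length across electrodes.

    amplitudes: array-like of shape (n_conditions, n_electrodes).
    ASSUMPTION: vector scaling is used here for illustration only; the
    chapter does not name the normalization method.
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    norms = np.linalg.norm(amplitudes, axis=1, keepdims=True)
    return amplitudes / norms


# Hypothetical mean amplitudes (µV) at T5, Oz, T6.
standard = [-0.5, -2.0, -0.5]
deviant = [-0.75, -3.0, -0.75]  # same shape, 1.5 times the amplitude

scaled = vector_scale([standard, deviant])
same_topography = np.allclose(scaled[0], scaled[1])
```

In this toy case the two conditions differ only in overall amplitude, so their scaled topographies coincide. That is the pattern the text describes: a condition-by-channel interaction that disappears after normalization points to an amplitude change rather than a change in scalp distribution.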

Figure 4. Grand average differences between responses evoked by standard and deviant 
spatial frequencies at three levels of task difficulty (n=14). 

At the midline leads, there was a Deviance x Channels interaction from 
120 to 140 milliseconds (F(2,12)=10.05, p<0.003), a trend towards a Task 
Load x Deviance interaction from 120 to 180 milliseconds (smallest 
F(2,12)=5.83, p<0.018), and a Task Load x Deviance x Channels interaction 
from 140 to 180 milliseconds (smallest F(4,10)=6.05, p<0.01), which was 
preceded by a trend from 120 to 140 milliseconds (F(4,10)=5.77, p<0.012). 
Thus, as illustrated in Figure 4, deviant stimuli evoked a positive response 
between 120 and 180 milliseconds, which was larger in the easier task 
conditions and larger at frontal than at parietal scalp sites. 

To summarize the results so far, there was (1) a Spatial Frequency effect, 
largest at Oz from 60 to 100 milliseconds, which was independent of Task 
Load, but amplified by Deviance; (2) a second Spatial Frequency effect 
between about 140 and 200 milliseconds, which was independent of both 
Task Load and Deviance; (3) a posterior Deviance effect (120-200 
milliseconds), which was independent of Spatial Frequency and Task Load; 
and (4) an anterior Deviance effect (120-180 milliseconds), which was larger 
for easier Task Load conditions. Thus, the posterior and anterior deviance 
effects were dissociated by their differential dependence on task load. In 
order to further clarify these effects, mean amplitudes between 60 and 100 
milliseconds were computed at Oz, and between 120 and 180 milliseconds at 
Fz and Oz. 
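
A minimal sketch of this mean-amplitude measure is given below in Python. It assumes an averaged waveform sampled at known latencies and takes the deviance effect as deviant minus standard, which matches the sign of the mean differences reported for these windows. The function names, the 20-ms sampling grid, and the waveform values are hypothetical.

```python
import numpy as np


def mean_amplitude(erp, times_ms, start_ms, end_ms):
    """Mean amplitude of an averaged waveform within a latency window.

    The window is inclusive at both ends. The last axis of `erp` is time.
    """
    erp = np.asarray(erp, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    in_window = (times_ms >= start_ms) & (times_ms <= end_ms)
    return erp[..., in_window].mean(axis=-1)


def deviance_effect(deviant, standard, times_ms, start_ms, end_ms):
    """Deviant-minus-standard difference in mean window amplitude."""
    return (mean_amplitude(deviant, times_ms, start_ms, end_ms)
            - mean_amplitude(standard, times_ms, start_ms, end_ms))


# Hypothetical waveforms (µV) at one site, sampled every 20 ms.
times = np.arange(0, 300, 20)
standard = np.zeros(times.shape)
deviant = np.where((times >= 60) & (times <= 100), -1.0, 0.0)

early = deviance_effect(deviant, standard, times, 60, 100)
late = deviance_effect(deviant, standard, times, 120, 180)
```

Here `early` is -1.0 and `late` is 0.0, because the toy deviant waveform differs from the standard only between 60 and 100 milliseconds.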



[Figure 5 plot. Panel title: Deviance Effects: 60-100 ms; site: Oz; surviving legend entry: High Spatial Frequency]




Figure 5. Means (+SEM) for the deviance effect at Oz between 60 and 100 milliseconds 
(n=14). These points reflect the difference between amplitudes to standard and deviant spatial 
frequencies, separately for each spatial frequency and task load condition. 

Figure 5 shows the deviance effect (i.e., the difference between responses 
evoked by standard and deviant gratings) at Oz in the time interval from 60 
to 100 milliseconds, separately for each task load and spatial frequency. A 
Spatial Frequency x Deviance x Task Load MANOVA revealed a main 
effect of Deviance (F(1,13)=5.86, p<0.031) and a Spatial Frequency x 
Deviance interaction (F(1,13)=11.60, p<0.005), but no Task Load effects. 
The interaction indicated that, irrespective of task load, only responses to 
high spatial frequencies were affected by deviance: Their C1 response was 
significantly larger when they were deviant than when they were standard 
(mean difference -0.65 µV, SE=0.12; F(1,13)=16.96, p<0.002). In contrast, 
the low spatial frequency response in the same time interval was not affected 
by stimulus deviance (mean difference +0.06 µV, SE=0.11; F(1,13)=0.18, 
p<0.68). Although Figure 5 suggests a modulation by task load of the 
deviance effect for the low spatial frequencies, this effect failed to reach 
significance when tested separately (F(2,12)=2.94, p<0.10). 

Figure 6 shows the same differences (i.e., deviant-standard spatial 
frequencies) for the time interval from 120 to 180 milliseconds at Fz (upper 
panel) and Oz (lower panel). At Oz there was a main effect of Deviance 
(F(1,13)=84.88, p<0.001), which did not depend on Task Load or Spatial 
Frequency. Thus, the magnitudes of the deviance effects were not 
significantly different for the two spatial frequencies (low spatial frequency: 
mean difference -1.33 µV, SE=0.16; high spatial frequency: mean 
difference -0.86 µV, SE=0.12; test on difference: F(1,13)=2.22, p<0.16). 

At Fz, there was a main effect of Deviance (F(1,13)=13.94, p<0.003), 
and an interaction between Deviance and Task Load (F(2,12)=6.49, 
p<0.013). Contrary to what is suggested by Figure 6, no effect 
involving Spatial Frequency was significant. However, since the deviance 
effect clearly seemed larger for low spatial frequencies in the "easy" task 
load condition, we ran separate Deviance x Task Load MANOVAs for each 
spatial frequency. For the high spatial frequency, there was a main effect of 
Deviance (F(1,13)=8.15, p<0.014), but no effects of Task Load. For the low 
spatial frequency, there were significant effects of Deviance (F(1,13)=9.83, 
p<0.008), Task Load (F(2,12)=20.30, p<0.002), and their interaction 
(F(2,12)=5.19, p<0.024). The interaction indicated that the deviance effect 
was larger for low spatial frequencies in the easier task load conditions. 
Hence, the absence of interactions with spatial frequency in the three-way 
analysis was due to the fact that the modulation of the deviance effect by 
spatial frequency occurred only in the "easy" task load condition, which 
weakened the overall statistical outcome. As a check on chance capitalization, 
additional analyses were performed on the Oz data in the bottom panel of 
Figure 6 that confirmed the earlier results (i.e., for both the high and low 
spatial frequency, the main effect of Deviance was significant, but the Task 
Load and Deviance x Task Load effects were not). Thus, the differential 
spatial frequency effects for the Fz data found in the separate analyses 
appear to be genuine effects and not Type I errors, and the absence of the 
interaction in the three-way analysis appears to be due to low statistical 
power (a Type II error). 



[Figure 6 plot. Panel title: Deviance Effects: 120-180 ms]

Figure 6. Deviance effects at Fz (upper panel) and Oz (lower panel) between 120 and 180 
milliseconds. Shown are the means and standard errors of the difference between responses to 
standard and deviant spatial frequencies, separately for each spatial frequency and task load 
condition (n=14). 

4. DISCUSSION 

Task-irrelevant gratings with either low or high spatial frequencies were 
flashed to the upper and lower part of a computer screen while subjects 
performed a continuous compensatory tracking task at three levels of 
difficulty. In half of the blocks, the low spatial frequency grating was 
standard and the high spatial frequency grating was deviant; in the other half 
this mapping was reversed. This design isolated effects related to the 
physical difference between the stimuli, effects related to the deviance of the 
stimuli, and effects related to interactions and modulations. Such a design is 
necessary if effects of infrequent presentation probability are to be 
interpreted as true deviance effects, without the confounding effects of 
different exogenous stimulus parameters. The logic and the design 
requirements here are identical to those in studies of voluntary attention, 
with stimulus relevance replaced by stimulus deviance (Heslenfeld et al., 
1997). 

The performance data showed that the three difficulty levels of the 
compensatory tracking task had the intended effects; keeping the moving 
rectangle in the middle of the screen was significantly easier in the "easy" 
condition and significantly more difficult in the "difficult" condition than it 
was in the "moderate" condition. In addition, at the more difficult levels 
only, frequent stimulation with low spatial frequencies interfered more 
with the tracking task than frequent stimulation with high spatial 
frequencies. This result is highly consonant with evidence showing that 
lower spatial frequencies are more capable of interrupting and interfering 
with ongoing visual processing than higher spatial frequencies. This 
evidence stems from neurophysiological and psychophysical data showing 
that lower spatial frequencies are preferably transmitted through transient 
visual channels that are thought to have larger inhibitory effects than the 
more sustained visual channels, which preferably transmit higher spatial 
frequencies (cf. Blum, 1991; Breitmeyer & Ganz, 1976; Hughes, 1986; 
Hughes et al., 1996). 

With respect to the evoked potentials, high spatial frequency gratings 
evoked a much larger early C1 component (60-100 milliseconds) than low 
spatial frequency gratings, which is in line with the well-known effects of 
spatial frequency on this component (Smith & Jeffreys, 1978; Spekreijse et 
al., 1973). This difference was unaffected by task load, which shows that 
visual task difficulty did not alter the early exogenous representations of 
these stimuli. This result is important because it excludes a number of trivial 
factors (such as differences in fixation, accommodation, arousal, etc.) as 
possible causes of other effects involving task load. 

However, a significant effect of stimulus deviance on the C1 component 
to high spatial frequency gratings was observed (see Figure 5). Deviant high 
spatial frequencies evoked a larger C1 response than standard high spatial 
frequencies. The latency, scalp distribution, and spatial frequency specificity 
of this effect clearly link it to the exogenous C1. It can be interpreted as a 
saturation of the response to high spatial frequency gratings when these 
stimuli were frequently presented, as opposed to a full response when these 
stimuli were deviant. This result is again very much in line with the different 
temporal characteristics of transient and sustained visual channels. Sustained 
channels, transmitting higher spatial frequencies, are thought to have slower 
conduction velocities and longer recovery times than transient channels 
(Breitmeyer & Ganz, 1976; Stone, 1983; cf. Brannan, 1992). Thus, 
responses to high spatial frequency gratings may saturate earlier than 
responses to low spatial frequency gratings when the respective stimuli are 
more frequently presented. 

This saturation effect was unaffected by task load, which again 
demonstrates the immunity of these early exogenous responses to the 
difficulty of the overt visual tracking task, as well as their replicability 
across task blocks. This finding further strengthens the interpretation of 
these effects as reflecting purely exogenous processes and again excludes 
trivial differences in fixation or arousal as possible causes of other task load 
effects. 

At frontal and central scalp sites, a positive response to deviant stimuli 
(120-180 milliseconds) was found, which depended on both task load and 
spatial frequency. That is, the response was larger to low spatial frequency 
deviants in the "easy" task load condition. The spatial frequency dependence 
of this response is again in line with the different functional characteristics 
of transient and sustained visual channels. Transient channels, transmitting 
lower spatial frequencies, are ascribed larger interruptive and attention- 
capturing powers than sustained channels (Brannan, 1992; Breitmeyer & 
Ganz, 1976; Hughes et al., 1996), which may have led to an enhanced or 
further processing of the low spatial frequency deviants. This effect may also 
be related to the frontal component of the auditory MMN (Giard et al., 
1990), which is thought to reflect an involuntary shift of attention towards 
unexpected, deviant input (Näätänen, 1992; see also Alho et al., 1998). As 
the present data suggest, this shift may be stronger (or may occur more 
frequently) the more processing resources are available, and the more salient 
the unexpected deviant input. 

At posterior scalp sites, a negative response to deviant stimuli (120-200 
milliseconds) was obtained, which was dissociated from the frontal effect by 
its independence of both spatial frequency and task load. This response was 
initially largest at the occipital midline (Oz, 120 to about 160 milliseconds), 
and later largest at lateral occipito-temporal sites (T5/T6, from about 160 to 
200 milliseconds; see Figure 4), which shows that it consisted of two 
separable sets of generators. However, both were independent of task load 
and spatial frequency, that is, of the difficulty of the concurrent visual 
tracking task and of the physical characteristics of the evoking deviant 
stimulus. Both were also independent of the possibly reduced quality 
of the exogenous stimulus representation as indicated by the attenuation of 
the earlier C1 response specifically to high spatial frequency gratings. Thus, 
both responses may be interpreted as reflections of endogenous (feature- 
independent) and automatic (attention-independent) processing of deviant 
visual input. As such, they may be interpreted as the visual analogues of the 
auditory mismatch negativity (MMN). 

The present visual MMN differs from the auditory MMN in that the 
former seems to be completely independent of the features of the evoking 
stimulus. In the auditory modality, the orientations of the generators in the 
upper temporal cortex have been reported to depend on the pitch of the 
evoking deviant tone (Tiitinen et al., 1993). These findings imply that the 
auditory MMN may be less endogenous than the visual MMN. However, the 
feature dependence of the auditory MMN is not firmly established, because 
this effect (a) was apparently not replicated in a subsequent study (Tiitinen et 
al., 1994), and (b) may have been caused by a summation of highly feature- 
specific N1 generators and less feature-specific MMN generators (Alho, 
1995). Future research in both modalities is needed to clarify the exact 
extent to which both MMNs are truly independent of the features of the 
evoking stimulus, as well as resistant to more severe withdrawals of 
modality-specific attentional resources. 

New auditory information usually appears as a change in some parameter 
of the acoustic input. Detecting and processing change are, therefore, 
important functions of the human auditory system. Studying how the human 
brain processes acoustic change has a long tradition in psychology (e.g., 
James, 1890). However, in complex natural auditory environments, it is not 
easy to determine what constitutes change. In the present review, “change” 
will be regarded as deviation from an aspect of the acoustic environment that 
has been registered by the system as regular input. This definition of change 
includes the notion that for any given system, including living organisms, 
change always corresponds to some previously registered regularity. 
Violation of a regularity that was not identified as such by a given system 
does not constitute change (for this system). For example, the human 
auditory system cannot usually detect long, periodically repeating sound 
patterns due to capacity limitations of the auditory sensory memory store 
(Guttman & Julesz, 1963). As a consequence, one cannot detect violations of 
such regularities. Another problem for conceptualizing change in complex 
auditory scenes is that a sound may simultaneously violate several 
regularities. To determine which aspect (or aspects) of this sound activated 
change-related processes, one must study what kind of auditory regularities 
were represented in the brain. Often, when the experimenter presents an 
auditory stimulus sequence to subjects and measures responses elicited by 
acoustic change, the contents of the regularity underlying the change-related 
response seem to be obvious to the experimenter since these experimental 
situations are usually very simple. However, the human brain is geared to 
deal with much more complex auditory environments and, therefore, might 
treat the sequence somewhat differently from what the experimenter expects. 

The goal of this review is to show that the processes involved in 
detecting changes in natural auditory environments can be assessed by using 
relatively simple paradigms. The evidence suggests that even in such 
simplified acoustic situations the human auditory system goes beyond what 
seems “obvious” to the experimenter in terms of identifying and representing 
regularities. The discussion focuses on capabilities of stimulus-driven 
(bottom-up) processing of auditory change and is based primarily on 
neuroimaging methods and especially event-related brain potentials (ERP), 
which have provided important insights into stimulus-driven processing of 
auditory information. 



1. ERP MEASURES OF STIMULUS-DRIVEN 
CHANGE DETECTION 

As an initial paradigmatic example of methods to study change detection, 
subjects are presented with a sound sequence consisting of a single repeating 
tone (termed the standard). Occasionally, the standard is exchanged for a 
different tone (termed the deviant), for example, one having a higher 
frequency — a task procedure often called the oddball paradigm. It has been 
known for more than 20 years that the deviant stimulus elicits an ERP 
component, termed the mismatch negativity (MMN, Näätänen, Gaillard, & 
Mäntysalo, 1978; for a review, see Näätänen & Winkler, 1999) whether or 
not the subject’s task requires attention to be focused on the auditory stimuli. 
Highly distinguishable (salient) deviant sounds also elicit another ERP 
component termed P3a (Squires et al., 1975; for reviews, see Knight & 
Scabini, 1998; Polich, 1998). As neither MMN nor P3a is elicited by 
regular or repetitive sounds, these components appear to index the neural 
activity involved in processing stimulus change. 
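
The structure of such a sequence can be sketched in a few lines of Python. The tone frequencies and deviant probability are borrowed from the Figure 1 example later in this chapter (600 Hz standards, 650 Hz deviants, p=0.10). The function name, its parameters, and the independent random draw for each tone are assumptions made for illustration, not a description of any particular study's stimulus program.

```python
import random


def oddball_sequence(n_tones, p_deviant=0.10, standard_hz=600,
                     deviant_hz=650, seed=0):
    """Return a list of tone frequencies for a simple auditory oddball run.

    Each position is independently a deviant with probability `p_deviant`.
    ASSUMPTION: real experiments often add constraints (for example, a
    minimum number of standards between deviants); none are applied here.
    """
    rng = random.Random(seed)
    return [deviant_hz if rng.random() < p_deviant else standard_hz
            for _ in range(n_tones)]


sequence = oddball_sequence(1000)
deviant_share = sequence.count(650) / len(sequence)
```

With 1000 tones the share of deviants lands close to the requested 0.10, and fixing the seed makes the same sequence reproducible across runs.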

It is important to note that MMN and P3a reflect different processes and 
can be dissociated from each other. MMN is not always followed by a P3a, 
but MMN is always followed by a P3a when the deviant stimulus is salient 
in the given auditory environment (Lyytinen et al., 1992). Moreover, even 
when both MMN and P3a are elicited in an auditory oddball paradigm, the 
amplitudes of the two components do not always vary together (Winkler et 
al., 1998). P3a can also be elicited without a corresponding MMN. An 
example that would elicit this component would be a loud or otherwise 
salient sound presented after a long silent period (Woods, 1990). However, 
only sounds deviating from some previously established non-silent auditory 
regularity elicit the MMN. Based on results from a large number of studies, 
current theories suggest that, whereas MMN is likely involved in processing 
auditory change within the large-capacity, stimulus-driven processing 
system (Näätänen, 1990, 1992; Näätänen & Winkler, 1999; Winkler et al., 
1996), P3a might reflect the redirection of attention to a stimulus that was 
encountered outside the focus of attention or “passive attention” (James, 
1890; see also, Näätänen, 1990, 1992; Öhman, 1979). 

The oddball paradigm is a model of changes that occur in natural 
acoustic situations. However, given that the paradigm is quite simple in 
structure, one might question whether it is necessary to assume the existence 
of a system pre-attentively representing auditory regularities as the basis of 
MMN elicitation. An alternative account suggests that the frequent repetition 
of the standard stimulus creates an unnaturally sharp sensory memory trace of 
this sound in the auditory system. In this account, MMN is elicited either by 
an automatic process comparing the trace of the deviant sound with the 
standard-stimulus memory trace (Ritter et al., 1995), or by the process that 
forms the sensory memory trace of the deviant sound in the presence of a 
strong standard-stimulus trace (Näätänen, 1984). 

Studies showing that MMN can be elicited without repeating any given 
sound argue against this “strong memory trace” explanation of MMN. 
Sound repetition is not a necessary prerequisite of the change detection 
process indexed by the MMN. Tervaniemi, Maury, and Näätänen (1994) 
presented tone sequences that were either ascending or descending in 
frequency. Occasional tones whose frequency did not fit into the trend 
elicited the MMN. This result showed that the regularity extracted from 
sequences was based on the direction of frequency change between 
successive tones, not on the repetition of some sound (Tervaniemi et al., 
1994). Other paradigms established that abstract rules can also provide a 
basis for the change detection reflected by the MMN (Paavilainen et al., 
1995, 1999; Saarinen et al., 1992). For example, when “the higher the 
frequency the lower the intensity” rule applied to the majority of the tones in 
a sequence, occasional deviants with low frequency and low intensity (or 
high frequency and high intensity) elicited the MMN (Paavilainen et al., 
2001). This result again cannot be explained solely on the basis of auditory 
sensory memory traces of some sounds. 
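
A toy Python sketch can make the point concrete: under an abstract rule, no individual tone needs to repeat for a deviant to be well defined. The midpoint-based check below is a deliberate simplification of “the higher the frequency the lower the intensity”, and the function name, midpoints, and tone values are hypothetical, not parameters from the cited studies.

```python
def follows_rule(freq_hz, intensity_db, freq_mid=1000.0, intensity_mid=70.0):
    """True if a tone pairs high frequency with low intensity or vice versa.

    ASSUMPTION: "high" and "low" are defined relative to fixed midpoints,
    which is only one simple way to express the rule.
    """
    return (freq_hz > freq_mid) != (intensity_db > intensity_mid)


# Hypothetical (frequency in Hz, intensity in dB) pairs.
standards = [(800, 78), (900, 74), (1100, 66), (1200, 62)]
deviants = [(850, 64), (1150, 76)]  # low/low and high/high combinations

standards_ok = all(follows_rule(f, i) for f, i in standards)
deviants_flagged = [not follows_rule(f, i) for f, i in deviants]
```

Every standard in this example is a physically different tone, yet all of them obey the rule, while both deviants break it. A trace of any single repeated sound could not separate the two groups.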

Furthermore, MMN may not be elicited by deviants following a long (12 
seconds) silent interval, even when subjects are able to discriminate the 
standard and deviant tones across this time period (Berti et al., 2000). This 
observation can be linked with those showing that MMN is not elicited by a 
change between two sounds without the first sound being repeated a few 
times, although the sensory memory representations are sufficiently clear for 
voluntary discrimination of the sounds (Cowan et al., 1993). Instead, the 
same sound must be presented a few (minimally 3) times for a subsequent 
different sound to elicit the MMN (Cowan et al., 1993; Horváth et al., 2001; 
Schröger, 1997; Winkler et al., 1996a, 1996b). That is, the sensory 
representation of an auditory stimulus can only serve as the “standard” for 
the MMN-generating change detection process if this stimulus is repeated 
prior to (but not long before) the “deviant” sound. 

The results reviewed above strongly support the notion that the change- 
related processes reflected by the MMN component are based on 
representations of the current auditory regularities, rather than on strong 
sensory memory traces of a repeating sound (Winkler et al., 1996b). Even in 
the case of a single repeating tone, as in the auditory oddball paradigm, the 
“standard” underlying change detection is a regularity (the repetition of a 
tone, including the auditory representation of the tone itself), not just the 
auditory sensory memory trace of the repeating tone. 

Because the auditory oddball paradigm and the most commonly used 
methods of sound presentation neglect so many aspects of natural situations, 
the question arises as to whether MMN can be elicited in complex natural 
acoustic environments or whether it is the product of an artificially 
constrained stimulation procedure used in the experiments in which the 
MMN component was originally discovered. In other words, does MMN 
reflect an important part of the change-detection system processing new 
information in everyday life, or is it more or less a laboratory artifact? The 
following sections will discuss the features of natural auditory scenes that 
are missing from the typical oddball paradigm, and will review results of 
experiments that were designed to model these natural features. 



2. STIMULUS PRESENTATION 

There are two aspects of auditory stimulus presentation used in most ERP 
experiments, which neglect important characteristics of natural acoustic 
environments: sound complexity and free-field sound delivery. Most studies 
present simple sinusoid tone bursts, whereas sounds encountered in natural 
situations are usually more complex. Addressing this issue, studies of 
harmonic tones (Tervaniemi et al., 2000; Winkler et al., 1997), chords (Alho 
et al., 1996), synthesized and digitized natural speech sounds (e.g., Sams et 
al., 1990; Sandridge & Boothroyd, 1996), noise bursts (Nordby et al., 1994), 
and other complex sounds (Sams & Näätänen, 1991; Winkler et al., 1998) 
have shown that MMN as well as the P3a are elicited (in fact they might 
even be larger in amplitude, see, e.g., Csépe & Molnár, 1997; Tervaniemi et 
al., 2000) when complex sounds are used. One should note that natural 
MMN-eliciting sounds often also elicit P3a, and natural (environmental or 
complex harmonic) sounds embedded in sequences of simple tone bursts 
invariably evoke this component (Escera et al., 2000). 

General ERP research practice includes the use of headphones to deliver 
sound, which permits fine control over auditory parameters. However, this 
technique is not a necessary prerequisite of MMN or P3a elicitation, as a 
number of studies have demonstrated the elicitation of change-related ERP 
components using stimuli presented via loudspeakers (e.g., Paavilainen et al., 
1989; Winkler et al., 1998). Thus, the auditory change ERP components are 
elicited when sounds are presented in a natural way. 



3. ACOUSTIC VARIANCE 

The most obvious abstraction of the oddball paradigm (compared to 
natural situations) is that the standard stimuli are identical sounds presented 
isochronously. Very few natural sound sources occur this way. Even with 
such a sound source, one would still have to assume a total lack of variance 
in the position of the listener as well as all other objects of the given 
environment that could affect acoustic reflections. Therefore, if the change- 
related processes reflected by MMN and P3a are natural phenomena, they 
should tolerate acoustic variance. 

Figure 1 provides illustrative examples of this assertion. Winkler et al. 
(1990) found that MMN was elicited when one tonal feature (intensity) of the 
standard stimulus was varied. Deviants differing from the standard in either 
the varying feature, intensity (Figure 1a), or another feature (frequency, 
Figure 1b) elicited MMN. The MMN’s tolerance to acoustic variance has 
also been demonstrated with infrequent changes in a constant sound feature 
when several other features vary (Huotilainen et al., 1993; Gomes et al., 1995). As 
outlined above, several studies demonstrated that abstract regularities, such 
as frequency ascension (or descension), can be pre-attentively extracted from 
the auditory input (Paavilainen et al., 1995, 1999, 2001; Saarinen et al., 
1992). This means that the auditory regularity representation system 
underlying MMN generation tolerates acoustic variance and can use it to 
extract higher-order regularities. These findings suggest a primitive sensory 
intelligence that is active even when the sounds lie outside the focus of 
attention. 

Another source of acoustic variance stems from the timing of sound 
delivery. Although most experimental designs used uniform sound 
presentation rates, constancy of stimulus delivery rate is not necessary for 
MMN elicitation in the oddball paradigm (e.g., Böttcher-Gandor & 
Ullsperger, 1992). In addition, Horváth et al. (2001) showed that 
infrequently violating the alternation of two tones differing in frequency 
resulted in MMN elicitation even when the inter-stimulus interval between 
successive tones was varied. In sum, the change-related processes reflected 
by the MMN response tolerate, or can even utilize, variability in spectral as 
well as temporal acoustic parameters. 



[Figure 1 plot. Panels: Intensity condition (channel Fz) and Frequency condition (channel Fz); rows by standard intensity (dB): 80, 80±0.8, 80±1.6, 80±3.2, 80±6.4]




Figure 1. Grand-average (10 subjects) frontal (Fz) MMN responses elicited by intensity- 
deviant (70 dB/600 Hz, Figure 1a) and frequency-deviant (80 dB/650 Hz, Figure 1b) tones 
(p=0.10) presented amongst varying-intensity 600 Hz standard tones (p=0.90). The five levels 
of intensity variation (no, ±0.8, ±1.6, ±3.2, and ±6.4 dB) were tested in separate blocks using 
9 equidistant intensity steps in each stimulus block (except in the no variance block). The 9 
intensity variants of the standard tone (“substandards”) were delivered with equal (p=0.10) 
probability. Responses to the standard (averaged across all different intensity levels; thin line) 
and deviant tones (thick line) are presented on the left side of the figure. Differences between 
the responses to the deviant and standard tones (averaged across all intensity variants; thick 
line) and those between the deviant and the 80 dB substandard tones (thin line) are shown on 
the right side of the figure. Different levels of intensity variation are shown as rows, 
separately for the intensity (A) and frequency (B) conditions (after Winkler et al., 1990). 



4. TEMPORAL PATTERNS VS. SINGLE SOUNDS 

Natural sound sources usually emit patterns of sounds rather than single 
stimuli. This is another important feature in which the typical oddball 
paradigm simplifies natural situations. The acoustic regularities inherent in 
speech, music, and in most environmental sounds (e.g., the sound of the 
hooves of a galloping horse), can be better described in terms of temporal 
sound patterns, rather than in terms of individual sounds. The repetitive or 
otherwise regular feature of the sound sequence is based on segments 
consisting of a number of sound elements (e.g., the sequence of steps made 
up by the galloping horse), which may or may not be separated by silent 
intervals. For the regularity representation system to function efficiently in 
natural environments, the perceptual “units” and their regularities must be 
detected together as these often define the unit itself (Port, 1991). 

Schröger and his colleagues (Schröger, 1994; Schröger, Näätänen, &
Paavilainen, 1992; Schröger, Paavilainen, & Näätänen, 1994) demonstrated
that MMN is elicited by occasionally changing the frequency of one tonal 
segment of a concatenated tonal pattern, consisting of 6-8 segments within a 
repetitive series of this tonal pattern, whether or not the tonal patterns were 
separated by silent intervals. Occasionally changing the frequency of a tonal 
segment introduces deviant frequency transitions at the borders of the 
deviant segment (even if the frequency of the deviant segment is equal to 
that of one of the other segments). Therefore, these results could also be 
explained by assuming that the auditory system encodes frequency changes 
between consecutive sounds (as was later confirmed by Paavilainen et al., 
1999). To verify that MMN can be elicited by violating regularities based on 
tonal patterns rather than regularities based on individual tones or 
relationships between consecutive tones, Winkler and Schröger (1995)
assessed whether MMN is elicited when two segments of identical frequency 
but different durations are exchanged. Furthermore, as illustrated in Figure 
2a, the tonal patterns were presented under four different conditions: 1) as a 
sequence consisting of a discrete repeating tonal pattern, with consecutive 
patterns being separated by silent intervals, 2) as a periodic sequence 
repeating the same tonal pattern with no silent interval between consecutive 
patterns (two versions differing only in the duration of the tonal segments), 
and 3) as a sequence of four discrete tones repeating periodically with the 
same amount of silence separating consecutive tones and consecutive cycles 
of the four tones. Figure 2b indicates that MMN was obtained in all of these 
conditions, demonstrating that the regularity representations involved in the 
MMN-generating process can encode sound sequences in terms of patterns. 

Figure 2. ERP responses to occasional violations of repeating tonal patterns. In each
condition, two patterns of 4 tonal segments were used (Figure 2a). Pattern B was obtained by
exchanging segments 2 and 4 of pattern A and vice versa. The exchanged tonal segments
were identical in frequency but differed in duration. In half of the stimulus blocks of each 
condition, one pattern served as standard (p=0.90), the other as deviant (p=0.10). In the other 
half of the stimulus blocks the roles of the two patterns were exchanged. Timing and 
frequency parameters are shown on Figure 2a. Grand-average (12 subjects) frontal (Fz), 
central (Cz), and parietal (Pz) responses were calculated by subtracting the response to the 
standard pattern from the response to the same pattern when it served as the deviant (Figure 
2b). The difference responses for the two patterns (A and B) were collapsed. The “reference” 
time point (marked on Figure 2b) was set to the onset of the difference between the standard 
and deviant patterns (i.e., the latency at which the shorter of the two exchanged segments 
ended). The tonal patterns (Figure 2a) and ERP difference responses (Figure 2b) for the 4
conditions are shown in separate rows (after Winkler et al., 1995).

Winkler and Schröger’s (1995) results obtained in the separate-tone
condition (item 3 above) bring up the question: When does the auditory 
system represent a sound sequence in terms of individual sounds and when 
does it do so in terms of sound patterns? What rules govern the formation of 
patterns? Scherg et al. (1989) reported that regular repetition of a pattern of
tones is not sufficient in itself. They presented periodically repeated 5-tone
sequences of 4 identical tones followed by a different one (SSSSD), with
the same inter-stimulus interval (ISI) within and between patterns.
MMN was elicited by the pattern-ending D tones just as it was when the S 
and D tones were presented in a randomized order with the same 
probabilities (0.80-0.20, respectively). These results suggested that in both 
conditions (regular and randomized-order presentation), the regularity 
encoded by the system underlying the MMN-generating process was the 
repetition of the S tone. D tones violated this regularity, thus eliciting MMN. 
If the regular sequence had been represented in terms of the repeating 5-tone
pattern, the pattern-ending D tone should not have elicited MMN, as it 
would have been encoded as part of the regularity (i.e., a part of the 
repeating 5-tone standard pattern). Sussman et al. (1998a) hypothesized that 
the relatively slow presentation rate used in the regular-presentation 
condition (1.3 second stimulus onset asynchrony [SOA]) by Scherg et al.
(1989) prevented pre-attentive detection of pattern repetition, since one full 
cycle pattern lasted 6.5 seconds. Hence, as three presentations of a pattern is 
the minimum required for establishing a regularity for MMN (Cowan et al., 
1993; Schröger, 1997; Winkler et al., 1996b), these long auditory patterns
exceeded by far the estimated temporal span of auditory sensory memory 
(ca. 10 seconds, Cowan, 1984). In contrast, in the separate-tone condition of 
Winkler and Schröger’s (1995) study, one cycle of the tones lasted only
860 milliseconds. Sussman et al. (1998a) speeded up the rate of stimulus 
delivery in Scherg et al.’s paradigm to 100 milliseconds SOA (resulting in a 
500 milliseconds cycle for the tone pattern) and found that the D tones no 
longer elicited MMN. However, the D tones did elicit MMN in the
comparable randomized-order condition. Hence, the repetition of a tone
pattern can be detected without focused attention only if the cycles are
sufficiently short.³ Thus, even though the capacity of the pre-attentive
system for detecting temporal sound patterns might be limited compared to 
what can be attentively processed, stimulus-driven processing seems to be 
well prepared to deal with regularities based on sound patterns, as is required 
for detecting changes in natural situations. 
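
The timing argument above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function name and constants are hypothetical, and the values (a minimum of three pattern presentations, a sensory memory span of about 10 seconds, and the three cycle durations) are simply those quoted in the text.

```python
# Illustrative arithmetic only; names are hypothetical, values come from the text.
MIN_PATTERN_PRESENTATIONS = 3     # minimum presentations to establish a regularity
SENSORY_MEMORY_SPAN_MS = 10_000   # estimated span of auditory sensory memory (ca. 10 s)


def regularity_fits_memory(cycle_ms,
                           presentations=MIN_PATTERN_PRESENTATIONS,
                           span_ms=SENSORY_MEMORY_SPAN_MS):
    """Return (time needed for the minimum number of cycles, whether it fits the span)."""
    required_ms = cycle_ms * presentations
    return required_ms, required_ms <= span_ms


# Cycle durations quoted in the text (milliseconds).
cycles_ms = {
    "Scherg et al. (1989): 5 tones x 1300 ms SOA": 5 * 1300,
    "Winkler and Schroger (1995), separate-tone condition": 860,
    "Sussman et al. (1998a): 5 tones x 100 ms SOA": 5 * 100,
}

for label, cycle_ms in cycles_ms.items():
    required_ms, fits = regularity_fits_memory(cycle_ms)
    print(f"{label}: cycle {cycle_ms} ms, 3 cycles {required_ms} ms, fits span: {fits}")
```

Under these assumptions only the 6.5-second cycle fails the check (three cycles take 19.5 seconds), which is in line with the explanation offered by Sussman et al. (1998a).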



5. NON-REPETITIVE REGULARITIES 

Regularities that may apply to a given source are not restricted to 
repetition of a sound or a sound pattern. This is another important feature of 
natural environments that is not modeled by the oddball paradigm. Smooth 
transitions in location or pitch are usually parts of the regularities, whereas 
abrupt changes may signal outstanding auditory events. A change detection 
system working under everyday conditions must be able to separate smooth
transitions and encode them as regular features, so that the transitions are not
confused with regularity violations. Violations of non-repetitive regularities,
like those found with feature trends (pitch: Tervaniemi et al., 1994; virtual
movement in space: Winkler et al., in preparation), abstract relationships
(Horváth et al., 2001; Paavilainen et al., 1995, 1999, 2001; Saarinen et al.,
1992), and even repeating one sound in an ever-changing sequence of tones
(Horváth et al., 2001; Wolff & Schröger, 2001), elicit MMNs. These results demonstrate that
the processes underlying the MMN component can encode transition-based 
and higher-order (non-repetitive) regularities, separating them from irregular 
changes in sound features (which are detected as regularity violations). 



6. TEMPORAL DYNAMICS AND MULTIPLE
REPRESENTATIONS OF REGULARITIES 

In real life situations, the nature of what is regular might change in time. 
Some aspects of the stimulation that have been regular for some time may 
cease to be so or change their characteristics, while other aspects may 
become regular. Several changes may occur together, but they can also 
happen separately. Therefore, the human auditory change detection system 
must be able to dynamically adapt its regularity representations by 
eliminating outdated regularity representations and building up new ones. 

Winkler et al. (1996b) studied the course of the MMN while the 
regularity of the sound sequence changed from one repetitive tone to 
another. Short trains of tones were presented to the subject. Each train 
started with 6 long tones (450 milliseconds in duration), establishing the 
long tone as a repetitive regularity. The long tones were followed by 2, 4, or 
6 short tones (150 milliseconds in duration), or no short tones at all, as seen 
in Figure 3a. Each train ended with a medium-duration (300 milliseconds 
long) probe tone. Because the MMN response peaks 100-200 milliseconds
after the time at which deviation from the regularity commences, the latency
of the MMN response elicited by the probe tone revealed which of the two
possible standards (short or long) was active at the time the probe tone was
presented. An MMN elicited with respect to the long-tone standard could be
expected to peak between 400 and 500 milliseconds from the onset of the
probe tone, because the probe tone started to differ from the long tone at its
own offset (300 milliseconds). In contrast, the probe tone started to differ
from the short tone at the offset of the short tone (150 milliseconds).
Therefore, an MMN elicited by the probe with respect to the short tone could
be expected to peak between 250 and 350 milliseconds from the onset of the
probe. As Figure 3b illustrates, the probe elicited MMN with respect to the
long-tone standard under three different circumstances: when no, 2, or 4
short tones intervened between the last long tone and the probe. The probe
elicited MMN with respect to the short-tone standard when 4 or 6 short tones
preceded it. When the 6 long tones
were followed by 4 short ones, the probe elicited two MMNs, one with 
respect to each of the two standards. Naturally, the first few (2 or 3) short 
tones following the 6 long ones also elicit MMN with respect to the 
long-tone standard. These results demonstrated that the auditory system 
closely follows the change from one repeating tone to another. 
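
The expected latency windows used in this design follow directly from the tone durations. The sketch below merely restates that arithmetic; the function name is hypothetical, and the only inputs are the durations given in the text and the 100-200 millisecond delay between the onset of deviation and the MMN peak.

```python
# Illustrative arithmetic only; the function name is hypothetical.
MMN_PEAK_DELAY_MS = (100, 200)  # delay from onset of deviation to MMN peak (from the text)


def expected_mmn_window(standard_ms, probe_ms, delay_ms=MMN_PEAK_DELAY_MS):
    """Expected MMN peak window (ms from probe onset) for a duration deviant.

    Two tones that differ only in duration become distinguishable at the
    offset of the shorter one, so deviation starts at min(standard, probe).
    """
    deviation_onset_ms = min(standard_ms, probe_ms)
    return (deviation_onset_ms + delay_ms[0], deviation_onset_ms + delay_ms[1])


PROBE_MS = 300
print("vs. long standard (450 ms): ", expected_mmn_window(450, PROBE_MS))
print("vs. short standard (150 ms):", expected_mmn_window(150, PROBE_MS))
```

The two windows (400-500 and 250-350 milliseconds) do not overlap, which is why the latency of the probe-elicited MMN can indicate which standard was active.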

Figure 3. ERP responses elicited by medium-duration (300 milliseconds) probe tones
following a change of the repeating (standard) stimulus. Four different short trains (Figure 3a) 
were equiprobably mixed together within the stimulus blocks. Trains were separated by the 
same interval (800 milliseconds onset-to-onset delay) as consecutive tones within the trains. 
Each train started with 6 long-duration (450 milliseconds) tones and ended by the probe tone. 
Trains differed in the number of short-duration (150 milliseconds) tones intervening between 
the last long-duration tone and the probe: either no, 2, 4, or 6 short tones were delivered 
(Figure 3a). The 3 types of tones differed only in their duration; other parameters (frequency 
1000 Hz, intensity 80 dB) were identical for all tones. Figure 3b (left side) presents the grand- 
average (8 subjects) frontal (Fz) responses to probe (thick line) and identical reference tones 
(thin line). The reference response was obtained in a separate stimulus block in which the
probe tone was presented alone with the stimulus delivery rate used in the other blocks. The 
right side of Figure 3b shows the probe-minus-reference difference responses. Since the 
MMN response usually follows by 100-200 milliseconds the moment when the difference 
between the deviant (probe) and the “standard” stimulus commences, the early frontally 
negative difference (peaking between 250 and 350 milliseconds; marked by gray shading on the left
side of Figure 3b) reflects an MMN elicited by the probe with respect to the short-duration 
standard stimulus. The late difference (peaking between 400 and 500 milliseconds; marked by 
black shading on the left side of Figure 3b) reflects an MMN elicited by the probe with 
respect to the long-duration standard stimulus. The long- and short-duration standards and the 
probe tone are marked on Figure 3b to help relate the MMN components to the deviance
which elicited them. Responses elicited in different trains are presented in separate rows (after
Winkler et al., 1996).

Horváth et al. (2001) extended the notion of the adaptation of regularity
representations to changing between two different types of regularities. 
These authors tested the transition from tone alternation (ABABAB...) to 
tone repetition (...BBB) and back to alternation. MMN was elicited by 
returning to alternation after repeating a tone twice (...ABABABBBA). This
result suggests that the repetition rule for the B stimulus was established by 
just 3 consecutive presentations of this tone. The repetition rule was violated 
by the return of alternation (tone A). Repetitions of the B sound 
(...ABABABB and ...ABABABBB) also elicit MMN in this situation as 
they violate the preceding alternation rule. Regularly alternating tones (not 
immediately preceded by stimulus repetition) do not, of course, elicit MMN. 
Again, the new regularity (repeating the B tone) was represented soon after 
it appeared. 

Taken together, these findings suggest that the change detection system 
reflected by MMN quickly adapts to the emergence of new regularities. The 
emergence of a new regularity does not in itself erase previous regularities. 
This was shown by the elicitation of two MMNs, one with respect to the 
long and the other with respect to the short-tone standard, by Winkler et al.’s 
(1996b) medium-duration probe tones. The elimination of an “outdated” 
regularity representation is somewhat slower. Several studies have found 
that regularity representations may be retained for quite long periods (>10 
seconds), although they need to be reactivated before MMN can again be 
elicited with respect to them (Cowan et al., 1993; Ritter et al., 1998; Winkler 
et al., 1996a). Fast formation and relatively long retention of regularity 
representations are probably optimal strategies in natural auditory 
environments. 

These characteristics of the pre-attentive regularity representation system 
demand simultaneous maintenance of multiple regularities for the same 
sound sequence. In the studies reviewed above (Horváth et al., 2001;
Winkler et al., 1996b), one regularity was exchanged for another. However,
in many situations, the same sound sequence may have several different 
regular aspects. Even simple tone repetition might be regarded as a set of 
separate repetitive regularities for the different tone features, since 
infrequently violating the constancy of a given sound feature elicits MMN 
when other sound features are randomly varied in the sequence (Gomes et 
al., 1995; Huotilainen et al., 1993; Nousak et al., 1996; Winkler et al., 1990, 
1995). The additivity between MMNs elicited by simultaneous deviance in
some (but not all) combinations of tonal features (Levänen et al., 1993;
Schröger, 1995; Takegata et al., 1999, 2001) partly supports the view that
feature repetitions are maintained in parallel to the repetition of the whole 
sound (Ritter et al., 1995). 

Winkler and Czigler (1998) demonstrated simultaneous maintenance of
two regularities for the same sound sequence, by occasionally delivering to 
subjects a deviant sound that violated two different auditory regularities of 
the test sequence. A sequence of two alternating tones was presented that 
differed only in frequency. Occasional deviants repeated the frequency of 
the previous tone (thus violating alternation) and were also shorter in 
duration than the standard (alternating) tones. The two successive MMN 
components elicited by these deviants indicated that both tone alternation 
and the constancy of tone duration were represented as regularities at the 
time when the deviant stimulus was delivered (for corroborating evidence, 
see Sussman et al., 1999). Moreover, even a simple rule, such as the 
alternation of two tones (ABABAB...) is simultaneously represented by 
several (redundant) regularities (Horváth et al., 2001). One rule enabled the
auditory system to extrapolate from one tone to the next (“local” rule: “A” is
followed by “B” and “B” is followed by “A”), the other from one tone to the
tone following the next (“global” rule: every second tone is identical).⁴
These results reveal that the machinery behind auditory change detection is a 
complex system forming, maintaining, and eliminating representations for 
multiple regularities even in seemingly simple auditory scenes (Bregman, 
1990). 
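
The distinction between the “local” and “global” rules for tone alternation can be made concrete with a small sketch. The code below is a toy illustration, not a model of the MMN-generating process: the function name is hypothetical, and it only checks, for each tone of a sequence, whether that tone breaks the local rule (a tone must differ from the one immediately before it) or the global rule (a tone must match the one two positions back).

```python
# Toy illustration only; it checks the two rules and says nothing about how
# the auditory system builds or updates them.
def rule_violations(sequence):
    """Return one (index, tone, breaks_local, breaks_global) tuple per tone."""
    report = []
    for i, tone in enumerate(sequence):
        # Local rule: "A" is followed by "B" and "B" is followed by "A".
        breaks_local = i >= 1 and tone == sequence[i - 1]
        # Global rule: every second tone is identical.
        breaks_global = i >= 2 and tone != sequence[i - 2]
        report.append((i, tone, breaks_local, breaks_global))
    return report


for row in rule_violations("ABABABBA"):
    print(row)
```

In a strictly alternating sequence neither rule is broken, whereas the first repeated tone in this example breaks both at once. This redundancy is what the two-rule description above refers to.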



7. MULTIPLE SIMULTANEOUSLY ACTIVE 
SOURCES 

Finally, the auditory oddball paradigm reduces the variance present in 
natural environments by eliminating other sound sources. In real life, one 
seldom encounters situations in which only one sound source is active at a 
time. If human auditory change detection were unable to simultaneously 
handle regularities and deviations of multiple sound sources, it would either 
flood us with signals marking illusory changes (e.g., sounds that are regular 
in their own stream but violate some regularity of another stream) or only 
inform us about gross changes in the acoustic input. 

Sussman et al. (1999) demonstrated that MMN can be made dependent 
on the segregation of two interleaved auditory sequences. These authors 
tested whether the processes of auditory streaming precede MMN 
generation. Auditory streaming is an important form of sound organization 
separating the signals of different sources, which appear together in the 
composite auditory input (Bregman, 1990; van Noorden, 1975). Streaming 
occurs when a sequence of sounds, mixed from two markedly different
sound sets (e.g., high- and low-pitched tones), is presented at a fast pace.
When it “streams”, this sequence is perceived as two independent sound
streams, and one can detect separate regularities in the two streams. When 
the physical separation between the two sets of sounds is small and/or the 
rate of stimulus delivery is slow, all sounds of the sequence are integrated 
into a single stream and regularities applying separately to the two sets of 
sounds cannot be perceived. Sussman et al. (1999a) showed that MMN 
elicitation follows the same pattern. High and low tones were alternated, 
repeating separate temporal tone patterns within the low and high sequence. 
When the sequence was presented at a slow pace, occasional changes in the 
high or low tone patterns did not elicit MMN. However, when the rate of 
tone delivery was increased, the same pattern violations elicited MMNs. 
Winkler et al. (submitted) showed that MMN elicitation and perception of
the corresponding within-stream regularity appear together. More important, 
Ritter et al. (2000) have demonstrated that the regularities embedded in 
separate auditory streams are independent of each other. Therefore, deviant 
sounds elicit MMN only with respect to regularities of their own sound 
stream. These results indicate that the change detection processes indexed by 
MMN operate on an organized representation of the auditory input, in which 
multiple sound sources are processed in parallel while maintaining
separate independent regularity representations. 



8. FEATURES OF THE CHANGE DETECTION 
SYSTEM REFLECTED BY MMN 

The change detection processes involved in MMN generation can operate 
in complex natural auditory environments. None of the characteristics of
natural situations that are not modeled by the auditory oddball
paradigm prevents the elicitation of MMN. MMN can be elicited by natural
sounds presented in open acoustic fields. Variations in sound parameters are
tolerated, even utilized by the regularity representation system underlying
MMN generation. Temporal sound patterns can be registered as stimulus 
units in detected regularities. Non-repetitive regularities are identified and 
represented. Smooth transitions are identified as regularities, abrupt changes 
as regularity violations. The system follows and adapts to the temporal 
dynamics of regularities, usually tracking several regularities
simultaneously. Signals from different sound sources are processed 
independently of each other. The fact that MMN indexes processes that have 
all these characteristics makes it a very useful tool for investigating auditory 
change detection. 



9. DISCUSSION

Although it is probably not true that all the processes up to and including
MMN are completely resistant to attentional manipulations, they do not
require one’s attention to be focused on the sounds or the engagement
of voluntary strategies for finding acoustic changes (Näätänen et al., 1993;
Ritter et al., 1999; Sussman et al., 1998b; Woldorff et al., 1991). This feature 
of the processes indexed by MMN is very important in everyday life 
situations, since one part of the environment can be attended while 
effortlessly monitoring the rest of the auditory input. The system keeps the 
internal representation up-to-date and detects sudden salient
changes outside the current focus of attention. The processes creating an 
orderly representation of the auditory environment, segregating sources, 
grouping together sound segments, finding temporal and spectral 
regularities, even operate outside the focus of attention. However, these 
processes are not fully automatic as, at least in some cases, their outcome 
can be affected by top-down processes (Bregman, 1990; Sussman et al., 
1998b). Such processes, which are neither strictly automatic nor fully 
dependent on attention, might be termed “default processes”, analogous to 
those computer functions that do their job even without user interaction, but 
might be altered by entering explicit commands or parameters. 

Change detection in natural auditory environments requires the existence 
of regularity representations. Forming and maintaining these representations 
are part of the processes that organize auditory input. The resulting set of 
analyzed information (containing descriptions of the sources and their 
current emission patterns) can be regarded as an internal model of the 
auditory environment. This model is essential for detecting changes in the 
unattended part of the auditory scene, as well as for selecting parts of the 
auditory scene. Due to the nature of the acoustic modality and the human 
auditory sensory organs, to select a single sound all other temporally 
overlapping sounds must also be delineated from the composite input. 
Otherwise, one would not be able to determine which part of the input 
belongs to the selected source. The fact that attention can be so quickly 
shifted between sources suggests that information about unattended sources 
is readily available. 

The processes generating the MMN might play an important part in 
maintaining the internal model of the auditory environment. When a 
previously registered regularity is violated, two types of actions might be 
necessary: 1) calling for further processing of the new information and 2) 
updating the affected regularities. The first process may lead to a 
stimulus-initiated attention switch, shown by the elicitation of the P3a. The 
second process initiates changes in the internal model. Winkler et al. (1996b) 
suggested that MMN is generated by this latter process. Winkler and Czigler
(1998) confirmed this hypothesis by showing that the number of MMNs 
elicited by a single deviant sound corresponds to the number of regularities 
that this sound violated within a short time period. That is, the same deviant
may elicit one MMN if it violates the same regularity twice, or two MMNs if
it violates two different regularities. Therefore, when one studies auditory
change detection using the MMN component, one probes an essential part of 
the change detection process: the maintenance of acoustic regularities. 



10. SUMMARY 

Processes of stimulus-driven auditory change detection can be studied 
using relatively simple stimulus paradigms. ERP components, such as the 
MMN and P3a, provide useful tools for investigating this important function 
of the human brain. However, the human auditory system is prepared for 
complex natural auditory environments and can do far more than what is 
required by any simplified test situation. 



NOTES

1 When the tones are attended, the deviant also elicits N2b (Renault & Lesevre, 1978; Ritter, 
Simson, & Vaughan, 1972; for a review, see Ritter & Ruchkin, 1992) and, if the subject’s 
task is to find the deviant sounds, the P3 component (Sutton et al., 1965; Donchin & Coles, 
1988). 

2 Of course, there are good reasons to use simple sounds and paradigms in ERP research: to
reduce the number of experimental variables and the noise of the measurements. 

3 One could of course attentively detect the periodicity of Scherg et al.’s (1989) tone 
sequence. 

4 Although maintaining multiple redundant representations may seem to be wasting 
resources, one should keep in mind that the alternation of two tones is a special case of a 
larger set of alternation regularities. There are alternation type regularities to which only 
one or the other rule may apply. For example, a sequence with a “higher pitch -
lower pitch - higher pitch - lower pitch...” regular pattern is an alternating sequence,
which conforms only to the above-described “local” rule. Horváth et al. (2001) showed that
occasional “higher-higher” and “lower-lower” segments embedded in such a sequence 
elicit MMN. Thus it seems that the pre-attentive auditory processing system is prepared to 
identify many different types of alternations, not only special cases (which are less likely to 
present themselves in natural situations). 



THEORETICAL OVERVIEW OF P3a AND P3b



JOHN POLICH
Research Institute, La Jolla, CA, USA



1. P3a AND P3b 

Recent empirical advances on the relationship between the P3a and P3b 
event-related brain potentials (ERPs) have suggested a plausible approach to 
how these potentials may interact. The purpose of this chapter is to review 
the issues surrounding these developments. The chapter is organized into 
several sections: First, the empirical background of the P3a and P3b 
subcomponent distinction is limned. Second, a theoretical perspective of 
P300 is presented. Third, the neuropsychological basis for the P300 
component is outlined in terms of how these subcomponents may be related. 
The goal is to provide a theoretical overview of the topic areas by integrating 
prior findings with current perspectives. 

1.1 Background 

The P300 was discovered over 35 years ago and has provided much 
fundamental information on normal and dysfunctional cognition (Bashore & 
van der Molen, 1991; Sutton et al., 1965). The upper panel of Figure 1
illustrates how this ERP component is often elicited by using the “oddball”
paradigm. Two different stimuli are presented in a random order, and the
subject is required to discriminate an infrequent target stimulus from the 
frequent standard stimulus by responding covertly or overtly to the target — 
typically a relatively easy discrimination (Picton, 1992; Polich, 1999). The 
target stimulus elicits the P300, which is not apparent in the ERP from the 
standard stimulus. 

Figure 1. Schematic illustration of the oddball (upper panel) and three-stimulus (lower panel) 
paradigms, with the ERPs from the stimuli of each task presented at the right. The oddball 
task presents two different stimuli in a random sequence, with one occurring less frequently 
than the other (target = T, nontarget = N). The three-stimulus task also presents a compelling
(not necessarily novel) distractor (D) stimulus that occurs infrequently, to which the subject 
does not respond but which elicits the “P3a” subcomponent. In each task, the subject responds 
only to the target stimulus, which elicits the “P3b”. 

The three-stimulus paradigm is a modification of the oddball task in 
which “distractor” stimuli are inserted into the sequence of target and 
standard stimuli. Figure 1 schematically illustrates the task situation in the 
lower panel. When “novel” stimuli (e.g., dog barks, color forms, etc.) are 
presented as distractors in the series of more “typical” target and standard
stimuli (e.g., tones, letters of the alphabet, etc.), a P300 component that is
large over the frontal/central areas can be produced with auditory, visual,
and somatosensory stimuli (Courchesne et al., 1984; Knight, 1984; 
Yamaguchi & Knight, 1991b). This “novelty” P300 is sometimes called the 
“P3a” (Courchesne et al., 1975; Squires et al., 1975), and recent analyses 
confirm that these two potentials are the same brain potential (cf. Simons et 
al., 2001; Spencer et al., 1999). The parietal maximum P300 from the target 
stimulus is sometimes called the “P3b”. As the P3a exhibits a frontal/central
scalp distribution and relatively short peak latency and habituates rapidly, it is
thought to reflect frontal lobe function (Friedman & Simpson, 1994; Knight, 
1997) and can be elicited in a variety of populations (Fabiani et al., 1998; 
Friedman et al., 1998; Yamaguchi & Knight, 1991a). 

Infrequently presented nontarget visual stimuli that are easily recognized
and “typical” (i.e., not novel) also have been found to elicit a P300 with
maximum amplitude over the central/parietal rather than frontal/central areas
(Courchesne, 1978; Courchesne et al., 1978). This component is sometimes
referred to as a “no-go” P300, because subjects do not respond to the
infrequent nontargets (Falkenstein et al., 1995; Pfefferbaum et al., 1985). In
the auditory modality, infrequent nontarget tone stimuli that are readily
perceived (i.e., not novel) inserted into the traditional oddball sequence also
elicit a central/parietal maximum P300 (cf. Katayama & Polich, 1996a;
Pfefferbaum & Ford, 1988). When both an infrequent nontarget tone and a
novel sound are presented, the novel stimuli elicit a central maximum P300
and the infrequent nontarget tone elicits a central/parietal P300, the
amplitude of which is smaller than that of the novel stimulus potential (cf.
Grillon et al., 1990; Verbaten et al., 1997). Thus, the P300 component can
vary in amplitude and timing, because the intra-paradigm stimulus
relationships define the stimulus context (cf. Katayama & Polich, 1996b;
Suwazono et al., 2000). 

1.2 Stimulus Context 

Katayama and Polich (1998) assessed the role of task difficulty in the
three-stimulus paradigm to examine how stimulus context affects the P300
scalp topography. The perceptual distinctiveness between the target
and standard stimuli was manipulated in an auditory task by using typical 
tone stimuli that varied in pitch. When the target/standard discrimination 
was easy and the distractor stimulus was highly discrepant, P300 target 
amplitude was larger than that elicited by the distractor stimulus, and both 
component types were largest over the parietal electrode sites. However, 
when the target/standard discrimination was difficult and the distractor 
stimulus was highly discrepant, the distractor stimulus elicited a P300 that
was greater in amplitude frontally and shorter in latency than the target 
P300. Additional studies have found that the repeated distractor “typical” 
stimulus elicits a P3a component that is larger in amplitude over the 
frontal/central locations and shorter in latency than the target P3b 
components (Comerchero & Polich, 1998, 1999). These results suggest that 
the engagement of frontal lobe attentional mechanisms elicited by a difficult 
target stimulus detection task is a defining aspect of the stimulus context that 
contributes to P3a generation. 

Figure 2. Grand average ERPs (n=12) from different three-stimulus oddball stimulus 
conditions. Subjects responded to a target stimulus (5.5 cm diameter circle) and did not 
respond to a standard stimulus (5.0 cm diameter circle) or to the distractor stimuli. The 
distractor stimuli were 23.0 cm wide squares that were either all blue and always the same or 
novel patterns of different colors, with the two distractor stimulus types presented in separate 
conditions. 

This hypothesis has been assessed systematically by comparing “typical” 
with “novel” distractor stimuli under easy vs. difficult target/standard 
discrimination tasks. Figure 2 provides an illustration of the critical ERP 
results from a study that compared several distractor types in a visual 
three-stimulus task (Demiralp et al., 2001; Polich & Comerchero, 2002). The 
distractor stimuli were either a blue square that was the same for each trial or 
variegated colored patterns that changed on each trial, with both distractor 
stimuli much larger than the target and standard. The task was to 
discriminate a 5.0 cm diameter target stimulus circle from a 4.5 cm standard 
stimulus circle, with the P3a subcomponent elicited by the distractor 
stimulus types (23.0 cm²). The blue square and novel patterns produced P3a 
and P3b components that were virtually identical, such that the P3a 
components from the large squares were remarkably similar to those 
previously reported for “novel” stimuli (Courchesne et al., 1975; Simons et 
al., 2001; Squires et al., 1975). It is therefore reasonable to suppose that 
stimulus context (the relative perceptual distinctiveness among stimuli) 
determines both distractor and target P300 amplitude, since each stimulus 
type produces distinct scalp topographic distributions. 
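
To make the three-stimulus design concrete, the following minimal Python sketch builds a randomized trial sequence containing standard, target, and distractor stimuli. The stimulus probabilities (0.80/0.10/0.10), the function name, and the random seed are illustrative assumptions and are not taken from the studies cited above.

```python
import random


def make_oddball_sequence(n_trials, p_target=0.10, p_distractor=0.10, seed=0):
    """Return a shuffled list of 'standard', 'target', and 'distractor' labels.

    The default probabilities are hypothetical values chosen for illustration.
    """
    n_target = round(n_trials * p_target)
    n_distractor = round(n_trials * p_distractor)
    n_standard = n_trials - n_target - n_distractor
    trials = (["standard"] * n_standard
              + ["target"] * n_target
              + ["distractor"] * n_distractor)
    rng = random.Random(seed)  # fixed seed so the sequence is reproducible
    rng.shuffle(trials)
    return trials


sequence = make_oddball_sequence(200)
counts = {label: sequence.count(label)
          for label in ("standard", "target", "distractor")}
```

In such a sketch the stimulus context is manipulated outside the sequence itself, by changing how perceptually distinct the target, standard, and distractor stimuli are from one another.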



2. P300 THEORY 

2.1 Context Updating 

P300 amplitude is thought to index brain activity that is “required in the 
maintenance of working memory” when the mental model of the stimulus 
context is updated (Donchin et al., 1986, p. 256). Figure 3 illustrates this 
theoretical perspective and schematically portrays the updating processes 
hypothesized to produce the canonical P300 during oddball task 
performance. After initial sensory stimulus processing, a memory 
comparison evaluation is executed in which the current stimulus of the 
oddball sequence is compared to the previous stimulus. If no change in 
stimulus attributes is detected, the old “schema” or neural model of the 
stimulus environment is maintained, and sensory evoked potentials are 
recorded. However, if a new stimulus is processed the system engages 
attentional mechanisms to “update” the neural representation of the stimulus 
context and the P300 (P3b) is elicited, a process that is thought to index the 
ensuing memory storage operations, as P300 amplitudes are related to 
memory for previous stimulus presentations (Fabiani et al., 1990; Johnson, 
1995; Paller et al., 1988a). A variety of cognitive factors have been 
delineated in support of this view, with information content, stimulus 
probability structure, task relevance/difficulty, and stimulus properties all 
found to affect P300 measures (Donchin & Coles, 1988; Johnson, 1988b; 
Verleger, 1988). 

Figure 3. Schematic illustration of the context-updating model for P300 theory. Stimuli enter 
the processing system and a memory comparison process is engaged that ascertains whether 
the current stimulus is either the same as the previous stimulus or not (e.g., a standard or a 
target stimulus in the oddball task). If the incoming stimulus is the same, the neural model of 
the stimulus environment is unchanged, and signal averaging of the EEG reveals sensory 
evoked potentials (N100, P200, N200). If the incoming stimulus is not the same and the 
subject discriminates the target from the preceding standard stimulus, the neural model of the 
stimulus environment is changed or “updated”, such that a P300 (P3b) potential is generated 
in addition to the sensory evoked potentials. 
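
The comparison logic of the context-updating model can be sketched as a short Python function. This is a deliberately simplified toy, not an implementation of the theory: it assumes that the "neural model" is nothing more than the identity of the previous stimulus, and that an update occurs only when the current stimulus differs from the previous one and is the designated target. The function and variable names are hypothetical.

```python
SENSORY_COMPONENTS = ["N100", "P200", "N200"]


def context_updating(stimuli, target="target"):
    """Predict the ERP components the model associates with each stimulus.

    Simplifying assumption: the schema is the previous stimulus, and it is
    updated only when a target is discriminated from a different preceding
    stimulus.
    """
    previous = None
    predictions = []
    for current in stimuli:
        changed = previous is not None and current != previous
        if changed and current == target:
            # Schema is updated: P3b accompanies the sensory potentials.
            components = SENSORY_COMPONENTS + ["P300 (P3b)"]
        else:
            # Schema is maintained: sensory evoked potentials only.
            components = list(SENSORY_COMPONENTS)
        predictions.append((current, components))
        previous = current
    return predictions


predictions = context_updating(["standard", "standard", "target", "standard"])
```

In this toy version only the third stimulus, a target following a standard, is predicted to elicit a P3b in addition to the sensory potentials.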

P300 latency is considered to be a measure of stimulus classification 
speed unrelated to response selection processes (Kutas et al., 1977; 
McCarthy & Donchin, 1981; Pfefferbaum et al., 1986), such that its timing is 
independent of behavioral reaction time (Duncan-Johnson, 1981; Ilan & 
Polich, 1999; Verleger, 1997). Given that P300 latency is an index of the 
processing time that occurs before response generation, it provides a 
temporal measure of the neural activity underlying the processes of attention 
allocation and immediate memory. Further, component timing is negatively 
correlated with processing efficiency in normal subjects: Shorter latencies 
are associated with superior cognitive performance from neuropsychological 
tests that assess how rapidly attentional resources are allocated for memory 
processing (e.g., Houlihan et al., 1998; Polich et al., 1983, 1990b; Polich & 
Martin, 1992; Reinvang, 1999; Stelmack & Houlihan, 1994). This 
association is also found in clinical studies that indicate P300 latency 
increases as mental capability is compromised by dementing illness (e.g., 
Homberg et al., 1986; O’Donnell et al., 1992; Polich et al., 1986, 1990a; 
Potter & Barrett, 1999). 
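
As an illustration of how P300 amplitude and latency might be quantified, the sketch below averages single-trial epochs into an ERP and then takes the largest positive value within a latency window. The window limits (250-600 ms), the sampling rate, and the synthetic waveform are assumptions made for this example only; actual studies define these parameters according to their own recording and scoring procedures.

```python
import math


def average_epochs(epochs):
    """Average equal-length single-trial epochs sample by sample."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def p300_peak(waveform_uv, sample_rate_hz, window_ms=(250.0, 600.0)):
    """Return (latency_ms, amplitude_uv) of the maximum within the window.

    Assumes the waveform begins at stimulus onset (time zero).
    """
    start = int(window_ms[0] * sample_rate_hz / 1000)
    stop = int(window_ms[1] * sample_rate_hz / 1000) + 1
    segment = waveform_uv[start:stop]
    if not segment:
        raise ValueError("window lies outside the waveform")
    peak_offset = max(range(len(segment)), key=segment.__getitem__)
    latency_ms = (start + peak_offset) * 1000.0 / sample_rate_hz
    return latency_ms, segment[peak_offset]


# Synthetic data: a 10 microvolt positive deflection centered at 350 ms,
# sampled at 1000 Hz for 800 ms, scaled differently on two "trials".
template = [10.0 * math.exp(-((t - 350) / 50.0) ** 2) for t in range(800)]
epochs = [[0.5 * v for v in template], [1.5 * v for v in template]]
erp = average_epochs(epochs)
latency_ms, amplitude_uv = p300_peak(erp, sample_rate_hz=1000)
```

Because the synthetic deflection was placed at 350 ms, the function recovers that latency; with real data the measured latency would be the quantity of interest.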

2.2 Attentional Resource Allocation 

P300 is derived from neural activity such that it is necessarily affected by 
the physical state of its underlying physiology and therefore reflects arousal 
level. This interaction occurs in two ways: (1) a general arousing effect and 
(2) a specific or idiosyncratic effect that contributes to a complex pattern of 
activation that modulates information processing (Kok, 1990; Pribram & 
McGuinness, 1975). Tonic changes in arousal usually involve time periods on 
the order of minutes or hours and are manifestations of relatively slow 
fluctuations in the general or non-specific background arousal state of the 
individual, whereas phasic responses reflect the organism's energetic 
reaction to specific stimulus events. In this framework, tonic and phasic 
arousal effects originate from situational or spontaneous factors and affect 
the cognitive operations of attention and memory updating — i.e., the same 
processes hypothesized to underlie P300 generation (Polich & Kok, 1995). 
This theoretical interpretation is consonant with the context-updating view 
of P300 generation but additionally specifies a more general explanatory 
mechanism for the cognitive variables that affect this ERP component (cf. 
Donchin et al., 1986; Hillyard & Picton, 1987; Johnson, 1986). 



3. NEUROPSYCHOLOGY OF P300 

3.1 P300 and the Hippocampal Formation 

The precise neural origins and, therefore, the neuropsychological 
meaning of the P300 are as yet unknown although appreciable progress has 
been made in the last 20 years. Given the theoretical association of 
attentional and memory operations with P300, the first human studies on the 
neural origins of this ERP focused on the hippocampal formation. Initial 
reports employed depth electrodes that were implanted to help identify 
sources of epileptic foci in neurological patients. These recordings suggested 
that at least some portion of the P300 (P3b) is generated in the hippocampal 
areas of the medial temporal lobe (Halgren et al., 1980; McCarthy et al., 
1989). However, subsequent investigations using scalp recordings on 
individuals after temporal lobectomy (Johnson, 1988a; Smith & Halgren, 
1989), experimental excisions in monkeys (Paller et al., 1988b, 1992), and 
patients with severe medial temporal lobe damage (Onofrj et al., 1992; Rugg 
et al., 1991) found that the hippocampal formation does not contribute 
directly to the generation of P300 (Molnar, 1994). 

3.2 P3a Neural Substrates 

As outlined above, the P3a subcomponent is produced when the 
attentional focus required for the primary discrimination task is interrupted 
by an infrequent nontarget stimulus event, which does not have to be 
perceptually novel (Comerchero & Polich, 1999; Polich & Comerchero, 
2002). ERP studies on humans with frontal lobe lesions have demonstrated 
that P3a requires frontal lobe function (Knight, 1984). P3a from the novel 
distractor stimulus for the controls evinced frontal/central maximum 
amplitude, whereas P3b from the target stimulus produced a parietal 
maximum topographic scalp distribution. However, the frontal lesion 
patients demonstrated a clear diminution of the P3a subcomponent for the 
distractor stimulus, and the usual parietal maximum for the P3b from the 
target stimulus. These results imply that frontal lobe engagement is 
necessary for P3a generation and contributes to the larger role of these 
mechanisms in attentional control (Knight, 1990; Knight et al., 1995). 

Figure 4. Grand average auditory target and novel stimulus ERPs from normal controls and 
bilateral hippocampal lesion patients (n=7/group). Controls demonstrate robust P3a and P3b 
components, whereas hippocampal patients demonstrate highly reduced P3a components over 
the frontal/central recording site (after Knight, 1996). 

Frontal lobe activity is not the only neural source for the P3a, as the 
hippocampal formation has been associated with the ERP processing of 
“novelty” in patients with focal hippocampal lesions (Knight, 1996). Figure 
4 illustrates the primary findings. P3a amplitude from novel auditory 
distractor stimuli for the controls yields the typical frontal/central maximum 
scalp topography, whereas for the patients this subcomponent is virtually 
eliminated over frontal electrode sites. P3b amplitude from the target 
stimulus is generally similar between the groups at the parietal site as 
observed previously. Thus, P3a generation appears to require frontal lobe 
attentional mechanisms and hippocampal processes driven by novelty 
information processing (Knight, 1997). 

3.3 Frontal-to-Parietal Lobe Interactions 

Given this background, possible neuropsychological mechanisms for P3a 
and P3b generation can be developed. Figure 5 presents a schematic model. 
Discrimination between target and standard stimuli in an oddball paradigm is 
hypothesized to initiate frontal lobe activation that reflects the attentional 
focus required by task performance (Pardo et al., 1991; Posner, 1992), with 
ERP and neuroimaging findings demonstrating frontal lobe engagement for 
the detection of rare or alerting stimuli (McCarthy et al., 1997; Potts et al., 
1996; Verbaten et al., 1997). P3a is related to the neural changes in the 
anterior cingulate when incoming stimuli replace the contents of working 
memory, and communication of this representational change is transmitted 
to infero-temporal lobe stimulus maintenance mechanisms (Desimone et al., 
1995). P3b reflects memory storage operations that are then 
initiated in the hippocampal formation with the updated output transmitted to 
parietal cortex (Knight, 1996; Squire & Kandel, 1999). Although the exact 
pathways are not yet clear (Halgren et al., 1995a, 1995b), a variety of 
evidence suggests that the hippocampal formation contributes to these 
events, even though it is not necessary for P3b generation (Johnson, 1988a; 
Polich & Squire, 1993). In sum, when a distracting stimulus commands 
frontal lobe attention, a P3a is produced; when attentional resources are 
allocated for subsequent memory updating after stimulus evaluation, a P3b is 
produced to establish the connection with storage areas in associational 
cortex. 

As the model suggests, the neuroelectric events that underlie P300 
generation stem from the interaction between frontal lobe and 
hippocampal/temporal-parietal function as outlined above (cf. Kirino et al., 
2000; Knight, 1996). ERP and fMRI studies using oddball tasks have 
obtained patterns consistent with this frontal-to-temporal and parietal lobe 
activation pattern (He et al., 2001; Kiehl et al., 2001; Mecklinger et al., 
1998; Opitz et al., 1999). Further support comes from magnetic resonance 
imaging (MRI) of gray matter volumes and indicates that individual 
variation in P3a amplitude from distractor stimuli is correlated with frontal 
lobe area size, whereas P3b amplitude from target stimuli is correlated with 
parietal area size (Ford et al., 1994) — a finding that may underlie individual 
variability for observing P3a and P3b subcomponents from simple oddball 
tasks (cf. Polich, 1988; Squires et al., 1975). 




Figure 5. Schematic model of cognitive P300 activity. Sensory input is processed in parallel 
streams, with frontal lobe activation from attention-driven working memory changes 
producing P3a and temporal/parietal lobe activation from memory updating operations 
producing P3b. See text for explanation. 



It also should be noted that the initial neural activation during auditory 
discrimination appears to originate from right frontal cortex (Polich et al., 
1997), and that P300 amplitude is larger over the right compared to left 
frontal/central areas (Alexander et al., 1995; Mertens & Polich, 1997). 
Hence, after initial frontal processing of the incoming stimulus, activity is 
propagated between the cerebral hemispheres across the corpus callosum 
(Barcelo et al., 2000; Baudena et al., 1995; Satomi et al., 1995). This 
hypothesis is supported by evidence that larger callosal fiber tracts are 
associated with greater P300 amplitudes and shorter latencies (Alexander & 
Polich, 1997; Polich & Hoffman, 1998), most likely because of increased 
inter-hemispheric communication (cf. Driesen & Raz, 1995; Witelson, 
1992). Thus, the P3a and P3b are distinct ERP components that arise from 
the interaction between frontal lobe attentional control over the contents of 
working memory and the subsequent long-term storage operations. 



4. CONCLUSIONS 

This overview has attempted to summarize the mechanisms underlying 
P3a and P3b generation. As these brain potentials are related to fundamental 
aspects of mental function, they offer significant promise as a means to 
assess normative and impaired cognitive capability. Further assessment of 
their neuropsychological foundations will provide additional insight into the 
meaning of P300. The theoretical and methodological approaches outlined 
here are an attempt to provide a basis for this development and are derived 
from contemporary research findings on factors that govern P3a and P3b 
production. 



1. PREFRONTAL CORTEX 

The prefrontal cortex (PFCx) can be divided into the lateral, orbital and 
medial PFCx, which all contribute to attentional and novelty processing and 
flexible behaviors. The lateral prefrontal cortex (LPFCx) is a highly 
organized, differentiated, distinctly layered, granular isocortex, 
whereas the orbitofrontal cortex (OFCx) is a structurally more heterogeneous, 
less differentiated, agranular limbic cortex (Barbas, 2000). The LPFCx is 
interconnected with parietal/occipital visual association areas, posterior 
parietal heteromodal areas, and inferior temporal visual association areas 
(Kaufer & Lewis, 1999). Other main circuitries of the LPFCx include 
reciprocal connections with the cingulate and the orbitofrontal cortex. In 
contrast, the OFCx has extensive direct and indirect connections with limbic 
areas such as the amygdala complex, hypothalamus, and the hippocampal 
formation (Cavada et al., 2000), with additional interactions to the inferior 
temporal visual association areas and LPFCx (Kaufer & Lewis, 1999). The 
main connections of the LPFCx to association areas and the OFCx to limbic 
areas determine the primary behavioral functions of these areas, with the 
LPFCx well-suited for attentional and executive function and the OFCx for 
affective and reward-related functions. 

The PFCx allows for departure from automated actions (Mesulam, 1986). 
Adjusting behavior depending on the current situation, social context, and 
foresight requires inhibiting responding to the most salient stimuli as well as 
inhibiting previously acquired responses. Thus, inhibition is an essential 
component of cognitive flexibility and creative behaviors that tend to be 
compromised in patients with PFCx damage. Response inhibition has been 
suggested to be a general PFCx function that operates across different 
cognitive processes and brain regions (Roberts & Wallis, 2000). Both the 
lateral and the orbital PFCx perform general inhibitory functions, but the 
distinct cognitive processes that are modulated by these cortical areas differ 
and reflect the distinct neural circuitries in which they are imbedded. Dias et 
al. (1996, 1997) have suggested that the lateral prefrontal cortex is 
responsible for inhibitory control of attentional selection, while the 
orbitofrontal cortex is responsible for inhibitory control of affective 
responses. Hence, damage to the lateral prefrontal cortex leads to 
impairment in shifting attention from one perceptual dimension to another 
while damage to orbitofrontal cortex leads to an inability to alter behavior 
when the emotional significance of the stimuli changes (Dias et al., 1996, 
1997). Deficits in inhibitory mechanisms lead to different clinical symptoms 
in patients with lateral and orbital PFCx damage. Lack of inhibition is a 
likely explanation for the impulsivity and the socially inappropriate or disinhibited 
behaviors often observed after orbitofrontal damage (Levine et al., 1999), 
whereas deficits in attention, inflexible cognition, and stimulus-bound and 
perseverative behaviors may be signs of inhibitory deficits in lateral PFCx 
damage. 



2. ELECTROPHYSIOLOGY AND LESION METHODS 

Behavioral measurements, recordings of neuronal activity of single cells 
and neural populations, as well as blood flow changes in response to neural 
activity have been used to study PFCx function in both intact and lesioned 
brains. Despite limitations of each method, converging evidence from 
different techniques provides a more reliable and richer understanding than 
any single approach to the roles of prefrontal cortices in cognition, emotion, 
and behavior. The focus of this chapter is on results obtained from 
electrophysiological studies on neurological patients with focal lateral or 
orbital prefrontal damage. 

Electrophysiological techniques such as electroencephalography (EEG) 
and event-related brain potentials (ERPs) provide important approaches to 
study attention and other cognitive processes in humans (Näätänen, 1992). 
Lesion studies and functional magnetic resonance imaging (fMRI) help to 
delineate the brain regions engaged in cognitive processing. However, 
mental events occur so rapidly that fMRI methods often cannot resolve 
their time course (McIntosh et al., 1994, 1999). Despite the improving 
temporal resolution of the fMRI method, the sluggishness of the 
hemodynamic response underlying the fMRI signal restricts temporal 
information to the range of seconds. ERPs have millisecond temporal 
resolution and are therefore well suited for assessing the kinetics of human 
cognition in real time. In addition, fMRI is susceptible to artifacts 
originating from neighboring anatomical structures (e.g., air-filled sinuses), 
which can compromise reliable imaging of brain regions such as the 
orbitofrontal cortex. Thus, combining electrophysiology with the lesion 
method provides both temporal and spatial information and allows insight 
into the dynamics and neural circuitry of cognitive processes. 

Neuroanatomical information from lesion studies (Knight et al., 1998; 
Knight & Scabini, 1998), intracranial recordings (Baudena et al., 1995; 
Halgren et al., 1995a; Halgren et al., 1995b; Halgren et al., 1998) and 
combined neuroimaging and ERP studies (Heinze et al., 1994; Opitz et al., 
1999a; Opitz et al., 1999b) has delineated the neural regions responsible for 
generating several widely studied cognitive ERP components. For instance, 
attention sensitive visual ERPs, including a positive (P1, 110-160 
milliseconds) and a subsequent negative potential (N1, 125-225 
milliseconds), have been localized to the extrastriate cortex (Gonzalez et al., 
1994; Hillyard & Anllo-Vento, 1998; Martinez et al., 1999), and fMRI 
studies have confirmed extrastriate attention modulation (Brefczynski & 
DeYoe, 1999; Chawla et al., 1999; Kastner et al., 1999). 
Electrophysiological and neurological techniques have also defined a 
distributed cortical-limbic network activated within 150-400 milliseconds 
after a novel irrelevant stimulus event (Alain et al., 1998; Halgren et al., 
1998; Knight, 1984). Novel stimuli generate the P3a ERP, which is a 
positive-going component that occurs at about 300-400 milliseconds and is 
maximal over the anterior scalp. This novelty ERP is proposed to be a 
central marker of the orienting response (Bahramali et al., 1997; Courchesne 
et al., 1975; Escera et al., 1998; Knight, 1984; Yamaguchi & Knight, 1991). 
ERP evidence derived from neurological patients and intracranial ERP 
recordings in pre-surgical epileptics has revealed that a distributed neural 
network including the lateral and orbital PFCx, hippocampal formation, 
anterior cingulate and temporal-parietal cortex is involved in detecting and 
encoding novel information (Halgren et al., 1998; Knight, 1996; Knight, 
1997; Knight & Scabini, 1998; Verleger et al., 1994; Yamaguchi & Knight, 
1991; Yamaguchi & Knight, 1992). Neuroimaging has provided 
confirmation on the neuroanatomy of this novelty processing system, which 
engages involuntary attention (Clark et al., 2000; Downar et al., 2000; 
McCarthy et al., 1997; Menon et al., 1997; Opitz et al., 1999a; Opitz et al., 
1999b; Stern et al., 1996; Tulving et al., 1994; Tulving et al., 1996; for a 
review see Friedman et al., 2001). 

Strong convergence of lesion/ERP/fMRI data also has been obtained in 
voluntary attention paradigms. Voluntary stimulus detection generates a 
classic P300 or P3b potential that occurs between 300 and 700 milliseconds, 
has a posterior scalp maximum, and is primarily sensitive to attentional and 
cognitive factors rather than the physical properties of the stimulus (for 
reviews, see Picton, 1992 and Polich, 1998). The P300 can be triggered by 
detection of auditory, visual, somatosensory, and olfactory stimuli as well as 
by detection of missing stimuli in a train of irrelevant stimuli. The P300 
response to missing stimuli highlights the importance of cognitive factors 
over physical properties in the generation of these late ERP components. 

P300 in a voluntary target detection paradigm is referred to as P3b to 
distinguish it from P3a generated by task-irrelevant novel stimuli. P3a is 
maximal over fronto-central scalp areas and peaks in amplitude about 50 
milliseconds prior to P3b activity, which is maximal over parietal areas. 
Task-relevant and predictable stimuli lead to small P3a and large P3b 
responses, while unexpected and novel stimuli result in increased prefrontal 
P3a amplitude. Several explanations as to the functional significance of 
P300 have been offered, with most models focusing on attentional and 
mnemonic mechanisms. P300 amplitude depends on a variety of factors such 
as probability, context, and relevance of the stimuli, as well as the cognitive 
processes engaged by the behavioral task (Donchin & Coles, 1988; 
Katayama & Polich, 1998). The lack of a unitary theory on the functional 
significance of the P300 reflects the fact that multiple brain regions and 
cognitive processes generate scalp positivities between 300 and 700 
milliseconds after stimulus presentation that contribute to P300. In addition, 
a variety of evidence indicates that the temporo-parietal junction contributes 
to P300 generation. More important, both P3b and P3a are attenuated by 
lesions in the temporo-parietal junction in all sensory modalities (Figure 1; 
Knight, 1997; Knight et al., 1989; Yamaguchi & Knight, 1991). 
Furthermore, event-related fMRI studies have confirmed temporo-parietal 
junction activation during voluntary event detection (Clark et al., 2000; 
Linden et al., 1999; McCarthy et al., 1997; Menon et al., 1997). Thus, 
electrophysiological methods in conjunction with lesion studies have proved 
to be valuable approaches for identifying the neural circuitries involved in 
cognitive processing. 



3. LATERAL PREFRONTAL CORTEX 

The LPFCx has been implicated in multiple cognitive processes such as 
executive control, attention, language, and memory (Chao & Knight, 1998; 
Corbetta, 1998; Dronkers et al., 2000; Fuster et al., 2000; Knight et al., 1998; 
McDonald et al., 2000). The crucial role of the LPFCx function in intact 
human cognition and behavior is indicated by the variety of neurological and 
psychiatric disorders linked to LPFCx dysfunctions, such as schizophrenia, 
depression, attention deficit disorder, stroke, Parkinson’s disease and frontal 
lobe dementia (Akbarian et al., 1995, 1996; Jagust, 1999; Miller et al., 1991; 
Rosen et al., 2001; Stamm et al., 1993; Weinberger et al., 1986, 1992; 
Wilkins et al., 1987). Studies using fMRI have also defined the role of the 
LPFCx in working memory, response conflict, novelty processing, and 
attention (Barch et al., 2000; Botvinick et al., 1999; D’Esposito et al., 1995, 
1999a; D’Esposito et al., 1999b; D’Esposito et al., 1999c; Downar et al., 
2000; Jonides et al., 1993, 1998; Owen et al., 1998; Prabhakaran et al., 2000; 
Rypma & D’Esposito, 2000). In accordance with the imaging results, 
patients with lesions to the LPFCx have deficits in working memory 
(Harrington et al., 1998; Muller et al., 2002; Stone et al., 1998), response 
monitoring (Gehring & Knight, 2000), novelty processing (Knight, 1984; 
Knight & Scabini, 1998), and attention (Barcelo et al., 2000; Knight et al., 
1998). Evidence combining lesion, electrophysiological and fMRI data has 
also been essential for delineating the different roles of LPFCx in attentional 
mechanisms, including early modulation of primary sensory areas, later 
modulation of association areas as well as involvement in the novelty-driven 
involuntary attention network. 

Single cell recordings in monkeys (Rainer et al., 1998a, Rainer et al., 
1998b), lesion studies in humans (Barcelo et al., 2000; Knight, 1997; 
Nielsen-Bohlman & Knight, 1999) and monkeys (Rossi et al., 1999), as well 
as blood flow data (Büchel & Friston, 1997; Chawla et al., 1999; Corbetta, 
1998; Hopfinger et al., 2000; Kastner et al., 1999; Rees et al., 1997) have 
linked LPFCx to early attentional modulation of extrastriate cortex. Lateral 
prefrontal cortex attention effects span from early modulation of extrastriate 
activity beginning 125 milliseconds after stimulus delivery to subsequent 
visual processing extending throughout the ensuing 500 milliseconds 
(Barcelo et al., 2000). 

The LPFCx exerts both facilitatory and inhibitory modulation of 
posterior sensory and perceptual areas and contributes to both involuntary 
and voluntary attention networks. Facilitatory modulation of extrastriate 
activity can be detected as enhancement of visual P1 and N1 potentials in the 
100-200 milliseconds range (Mangun, 1995). Facilitatory PFCx modulatory 
effects on task relevant stimuli are not limited to early sensory processing 
but also include later processing stages and brain areas (Barcelo et al., 2000). 
Even though in simple detection tasks LPFCx does not appear to have 
significant contribution to brain potentials (e.g., N2 and P3b) reflecting 
target detection (Knight & Nakada, 1998; Knight & Scabini, 1998), more 
demanding cognitive tasks seem to rely on LPFCx modulation of posterior 
association areas (Swick, 1998; Swick & Knight, 1999). In addition to 
facilitatory modulation of relevant stimuli, LPFCx exerts inhibitory 
modulation of irrelevant stimuli. Consequently, LPFCx damage leads to 
increased distractibility (Bartus & Levere, 1977; Chao & Knight, 1995, 1998; 
Malmo, 1942; Woods & Knight, 1986). Increased distractibility is believed to 
partially explain the attentional deficits after brain damage (Kaipio et al., 
1999). Electrophysiological signs of increased distractibility are enhanced 
ERP potentials to task-irrelevant stimuli, such as primary auditory cortex 
evoked response amplitude increase to distractors in LPFCx patients (Chao 
& Knight, 1998). This inhibitory control of early sensory processing has 
been linked to a prefrontal-thalamic gating system (Guillery et al., 1998; 
Knight et al., 1998). 



Figure 1. Grand averaged ERPs from lesion patient and control groups in a simple detection 
paradigm across stimulus modalities illustrate the effects of frontal, temporo-parietal junction, 
parietal, and hippocampal lesions on target and novelty processing. Brain images at the left 
illustrate the average lesion location. Prefrontal lesions reduced novelty P3a across 
modalities, but had no effect on the target P3b. Temporo-parietal junction lesions affect both 
novelty P3a and target P3b leading to marked reductions in amplitudes to auditory and 
somatosensory stimuli and partial reduction to visual stimuli. Lateral parietal lesion had no 
significant effect on either P3a or P3b amplitudes or latencies. Hippocampal damage leads to 
significant reductions in P3a over the frontal sites, but P3b remained intact. 

In addition to its critical role in voluntary attention, the LPFCx is a key 
component of the novelty network that engages involuntary attentional 
mechanisms. Mismatch negativity (MMN) studies indicate that lateral 
prefrontal cortex initiates the novelty detection cascade prior to activation of 
other brain regions. If the novel event is sufficiently engaging, posterior 
cortical and medial temporal regions are recruited for further processing 
(Alain et al., 1998; Alho et al., 1994; Knight, 1996). PFCx novelty activation 
recorded with ERPs or neuroimaging habituates to repeated exposures to 
novel events and is modality independent (Knight, 1984; Knight & Scabini, 
1998; Peterson et al., 1999; Raichle et al., 1994; Yamaguchi & Knight, 
1991). Furthermore, there is a marked reduction of novelty P3a in patients 
with LPFCx damage, whereas the P3b in a simple detection paradigm 
remains largely unaffected (Knight & Scabini, 1998, Figure 1; Daffner et al., 
2000). These findings highlight the significant contributions of the PFCx to 
involuntary attention networks. 

Involuntary and voluntary attentional mechanisms rely on distributed 
neural networks comprised of multiple brain areas. Figure 1 illustrates the 
value of the lesion method in determining contributions of specific brain 
areas to attentional and novelty processing. Similar to LPFCx, lesions of the 
hippocampal formation lead to clear P3a amplitude reduction, but have no 
significant effect on P3b (Knight, 1996). This finding provides evidence for 
the role of the hippocampal formation in novelty detection. Contrary to its 
involvement in novelty P3a generation, the hippocampal formation does not 
seem to contribute significantly to most scalp recorded target P3bs. In 
contrast, lesions of the temporo-parietal area lead to reduction in both P3a 
and P3b in all modalities (Knight, 1997; Knight et al., 1989; Verleger et al., 
1994; Yamaguchi & Knight, 1991). Hence, the temporo-parietal junction is 
critical in multimodal processes involving both irrelevant novel and relevant 
recurring events engaging both involuntary and voluntary attentional 
mechanisms. Lateral parietal lesions did not significantly affect either P3a or P3b. Lateral 
parietal lesions thus serve as a brain-damage control comparison, indicating that the ERP 
amplitude reductions from focal brain damage to the LPFCx, temporo-parietal 
junction, and hippocampal formation are not the result of general 
brain lesion effects, but rather are specific to lesion location and to the disruption of 
the circuits involved in novelty and target processing. 



4. ORBITAL PREFRONTAL CORTEX 

The extensive neuroanatomical connections of the orbital prefrontal 
cortex with the limbic system (Barbas, 2000; Cavada, 2000; Price, 1999), as 
well as its connections to lateral prefrontal cortex make it a region well 
suited for integrating emotion and motivation with cognition and behavior. 
The orbitofrontal cortex is believed to play a variety of roles in guiding 
adaptive, motivated, and emotion-regulated behaviors. Clinical evidence 
since the landmark case of Phineas Gage in 1848 has highlighted the 
significance of the orbitofrontal cortex in emotional and social behavior 
(Dimitrov et al., 1999; Eslinger, 1999; Harlow, 1993; Macmillan, 2000; 
Nies, 1999). Lesions of the orbitofrontal cortex result in impaired social 
skills, emotional lability, and decreased impulse control. 

In contrast to well-preserved cognitive skills and only subtle difficulties 
in formal neuropsychological tests, the effect of orbitofrontal damage on a 
patient’s social behavior is considerable and may cause significant adverse 
consequences in their personal lives. Several reasons underlie impaired 
social skills in orbitofrontal patients: impaired insight (Leduc et al., 1999), 
difficulty in inferring mental states of others (Stone et al., 1998), failure to 
use emotions in guiding decisions (Bechara et al., 2000; Damasio, 1996), 
deficits in emotion recognition (Hornak et al., 1996), and impaired 
knowledge of moral rules or an inability to apply them (Anderson et al., 
1999). Further functions assigned to the orbitofrontal cortex that may be 
crucial for successful social behavior include assigning reward values to 
outcomes of voluntary action and updating the reward contingencies in a 
rapidly changing environment (Rolls, 2000), inhibiting previously rewarded but no 
longer rewarded behaviors (Dias et al., 1996, 1997), and modulating 
orientation to irrelevant environmental stimuli (Rule et al., in press). 

Although patients with orbitofrontal lesions often present with emotional 
lability and impulsive behavior, advanced dorsolateral prefrontal cortex 
lesions typically result in blunted affect and apathy. The blunted emotional 
state seen in dorsolateral patients is accompanied by attenuation of 
electrophysiological responses to novel stimuli (Paradiso et al., 1999), 
reflecting impairment in involuntary attentional mechanisms (Knight, 1984, 
Figure 1). In contrast, orbitofrontal patients show electrophysiological 
enhancement to novel environmental sounds (Rule et al., in press) as well as 
context dependent enhancement of responses to recurring events 
(Hartikainen et al., 2001, Figure 2). These enhanced electrophysiological 
responses suggest a failure in inhibitory mechanisms that may also underlie 
the impulsive behavior observed in OFCx patients. 

In contrast to diminished late ERP amplitudes seen in association with 
brain damage to LPFCx, temporo-parietal junction, and hippocampal 
formation, significantly enhanced ERPs are observed subsequent to OFCx 
damage. These enhanced ERP responses provide electrophysiological 
evidence for the inhibitory role of OFCx in humans (Hartikainen et al., 2001; 
Rule, 2000). Unlike in LPFCx damage, where the impairment of inhibitory 
control can be observed using traditional neuropsychological testing such as 
the Wisconsin card sorting test, in OFCx damage many neuropsychological 
“frontal lobe” measures remain intact (Stuss et al., 2000). Thus, 
neuropsychological testing often fails to detect deficits in inhibitory 
control in OFCx patients, despite sometimes significant adverse effects on 
everyday life due to disinhibition, whereas enhanced ERPs show promise for 
laboratory detection of neural disinhibition in OFCx patients’ responses 
(Hartikainen et al., 2001; Rule et al., in press). 

Figure 2. Enhanced ERPs to visual targets subsequent to bilateral orbitofrontal damage. 
Lesion reconstruction of 6 bilateral orbitofrontal patients is shown in the upper part of the 
figure. Average lesion location is indicated with shades of gray in the orbitofrontal area. The 
gray scale corresponds to lesion overlap across patients. The orbitofrontal lesions included 
bilateral damage in areas 10, 11, 12, and 13 with maximal damage in ventromedial 
orbitofrontal cortex. Orbitofrontal patients and age-matched controls discriminated between 
upright and inverted triangles (target). Targets were randomly presented in the left (LVF) or 
right (RVF) visual hemifield (150 milliseconds). A brief task-irrelevant novel stimulus (150 
milliseconds) selected from the International Affective Picture System (Center for the 
Study of Emotion and Attention, 1999) was presented centrally 350 milliseconds prior to the 
target. Difference waves reflecting LVF and RVF target processing are shown, with the ERP to 
the preceding novel stimuli subtracted (i.e., the ERP to novel stimuli not followed by a target is 
subtracted from the ERP to targets preceded by novel stimuli). The target stimuli waveforms with 
maximal amplitudes are shown for the LVF target from F3 and for the RVF target from P3. 
Significantly enhanced target ERPs were observed in patients with OFCx lesions, with frontal 
P3 enhancement to LVF targets and posterior N2 enhancement to RVF targets (after 
Hartikainen et al., 2001). 

OFCx seems to exert inhibitory modulatory control over anterior and 
posterior brain regions. Furthermore, this modulation appears to be 
hemisphere specific. Figure 2 illustrates this hemispheric asymmetry: the pattern of 
ERP enhancement observed subsequent to bilateral orbitofrontal lesions 
suggests distinct modulatory effects on the left and the right 
hemispheres. ERPs to left visual field (LVF) targets showed frontocentral P3 
enhancement, while right visual field (RVF) targets were associated with 
significant parieto-temporal N2 enhancement in orbitofrontal patients. 
Posterior N2 enhancement may reflect release of posterior association areas 
from orbitofrontal inhibitory control. Likewise the frontal P3 enhancement 
may reflect loss of orbitofrontal inhibitory modulation of lateral prefrontal 
circuitries involved in attentional processes. There are some well-known 
hemispheric asymmetries in attentional processes such as more frequently 
observed hemispatial neglect following right hemisphere damage (Mesulam, 
1981). Attentional asymmetries in performance due to novel emotional 
stimuli have been reported in healthy subjects and in patients with OFCx 
damage (Hartikainen et al., 2000a; Hartikainen et al., 2000b; Hartikainen et 
al., 2001). Asymmetries in attentional processes and in OFCx-hemisphere 
interactions may therefore be reflected in these asymmetrically enhanced 
ERP patterns observed after bilateral orbitofrontal damage. 



5. CONCLUDING REMARKS 

ERP lesion studies have provided converging evidence with animal 
research and brain imaging methods and have helped to clarify the roles of 
prefrontal cortex in attentional mechanisms. Damage to LPFCx impairs early 
facilitatory modulation of extrastriate processing of relevant stimuli and 
disrupts inhibitory modulation of distracting irrelevant events. These 
findings demonstrate the integral role of the LPFCx in voluntary attentional 
selection and in contributing to reliable neural signal-to-noise ratio in 
posterior sensory and perceptual brain areas. In addition to early modulation 
of sensory areas, the LPFCx is involved in voluntary attention, modulating 
posterior association areas and contributing to P3b when demanding 
cognitive operations are required. More automated tasks do not seem to rely 
on LPFCx modulation as evidenced by preserved P3b in a simple detection 
task after LPFCx damage. In addition to deficits in voluntary attentional 
mechanisms, involuntary attentional mechanisms that are engaged by novel 
stimuli are disrupted in lateral prefrontal cortex damage. ERP evidence for 
significant disruption of novelty processing in lateral prefrontal damage is 
clear. 

In comparison to significant reduction of novelty P3a in lateral prefrontal 
damage (Knight & Scabini, 1998, Figure 1), the enhanced, rather than 
diminished, responses to auditory and somatosensory novel stimuli suggest 
no significant disruption in general novelty processing after orbitofrontal 
damage (Rule et al., in press). The orbitofrontal cortex seems to modulate, 
rather than generate, novelty responses. Habituation depends on this 
modulatory effect, and failure in habituation to auditory and somatosensory 
novel stimuli is observed in orbitofrontal patients (Rule et al., in press). 

The orbitofrontal cortex seems to play a complex and context dependent 
inhibitory modulatory role that is evident from enhanced late ERP responses 
to both task-relevant and irrelevant stimuli (Figure 2; Hartikainen et al., 
2001; Rule et al., in press). Amplitudes of the late N2 and P3 ERP 
components to targets that were preceded by novel stimuli were significantly 
enhanced. Despite bilateral orbitofrontal damage, the pattern of ERP 
enhancement depended on the field of target presentation. This suggests 
distinct and lateralized OFCx-hemisphere interactions during attention 
modulation. Left visual field targets produced enhanced frontal positivity 
that may reflect release of lateral prefrontal processes, possibly similar to 
those involved in generating P3a. Right visual field targets produced left 
posterior N2 enhancement, possibly reflecting resource allocation to target 
discrimination in the posterior parieto-temporal association areas in the 
absence of orbital inhibitory modulation of posterior areas. The functional 
significance of these ERP findings remains to be established. However, it is 
apparent from these results that there is a clear modulatory effect of 
orbitofrontal cortex on both anterior and posterior brain structures in 
attentional processes. 

Future fMRI studies on patients with focal brain damage to orbitofrontal 
cortex may provide a more detailed picture of the specific brain areas 
involved in these asymmetrically disinhibited ERP responses. Converging 
results from ERP studies in focal brain damage and fMRI studies on intact 
brain have consolidated many of the theories related to the role of the 
prefrontal cortex in attention. To further elucidate the contribution of the 
lateral and orbital prefrontal cortices in attentional modulation of other brain 
areas, fMRI studies on patients with frontal damage using similar paradigms 
to those used in ERP studies may prove useful. 



ERP AND fMRI CORRELATES OF TARGET AND 
NOVELTY PROCESSING 

1. INTRODUCTION 

Human behavior presupposes the ability to respond actively to 
biologically significant events. One important behavior that occurs within 
the context of the interaction of an organism and the environment is the 
detection of changes. Such change detection may be regarded as the outcome 
of processes that include extraction of stimulus information, reallocation of 
attention, and sensory memory. Discrete lesions in frontal, temporo-parietal, 
or medio-temporal cortices can disrupt the behavioral processes associated 
with different aspects of change detection. What lesion studies cannot 
reveal, however, is which regions in the normal brain subserve the detection 
of a change. A combined analysis of two major methods in cognitive 
neuroscience, event-related potentials (ERPs) and functional magnetic 
resonance imaging (fMRI), can address the issue of spatio-temporal 
characteristics of normal brain activation underlying change detection. 



2. TWO PROCESSES, MULTIPLE BRAIN 
STRUCTURES 

The present chapter considers change detection processes that are 
triggered by events that are distinct from the majority of the occurring events 
and either signal task relevant information (target detection) or a 
perceptually novel event (novelty detection). Change detection processes are 
often assessed by means of the so-called "oddball task". This procedure 
requires subjects to attend to a stimulus stream and to discriminate between 
frequently occurring regular events and rare, irregular events (Donchin et al., 
1997). The regular events are called standards and the irregular events are 
referred to as targets, because in most studies a covert or overt response 
(mental counting, a button press) is required to induce task relevance of the 
stimulus events. When such a target is consciously detected, a parietally 
distributed positive ERP component is generated that is known as the P300 
or P3b (Sutton et al., 1965; Donchin, 1981; Donchin & Coles, 1988). There 
is accumulating evidence from neurophysiological studies (Knight, 1996; 
Halgren et al., 1995) that several brain structures are involved in different 
aspects of target detection. These structures include the medial temporal 
lobe, the frontal cortex, the supramarginal gyrus, and the anterior cingulate 
gyrus. Studies using fMRI also have found activation of the lateral prefrontal 
cortex, the insular cortex bilaterally, and subcortical structures, such as the 
thalamus during target detection processes in the auditory and visual 
modalities (McCarthy et al., 1997; Linden et al., 1999). In addition, source 
localization performed on the basis of magnetoencephalographic data 
suggests contributions from subcortical structures (Mecklinger et al., 1998). 
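
As a minimal sketch of the stimulus stream used in an oddball task, the following Python fragment draws a random sequence of frequent standards and rare targets. The target probability of 0.1, the sequence length, and the function name `make_oddball_sequence` are illustrative assumptions and are not parameters taken from the studies cited above.

```python
import random


def make_oddball_sequence(n_trials=400, p_target=0.1, seed=0):
    """Return a shuffled list of 'standard' and 'target' labels.

    Hypothetical helper: n_trials and p_target are assumed values chosen
    only to illustrate that targets are rare relative to standards.
    """
    rng = random.Random(seed)
    n_targets = round(n_trials * p_target)
    labels = ["target"] * n_targets + ["standard"] * (n_trials - n_targets)
    rng.shuffle(labels)  # rare targets end up at unpredictable positions
    return labels


sequence = make_oddball_sequence()
```

In an actual experiment, further constraints (for example, a minimum number of standards between successive targets) are often imposed; none are modeled here.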

As with target detection, novelty detection processes can be examined in 
the visual (Courchesne et al., 1975), auditory, and somatosensory (Knight, 
1996; Polich et al., 1991) modalities. Novel events cause an involuntary 
attentional shift because of their potential behavioral significance, even 
when they occur outside the current focus of attention (Knight, 1984). The 
neurophysiological mechanisms underlying the detection of novelty are 
reflected in the novelty P3 ERP. Compared to the P3b elicited by targets, the 
novelty P3 has a shorter latency and a more frontal scalp 
distribution (cf. Friedman & Simpson, 1994). 

The novelty P3 scalp topography appears to reflect the activity of a 
widespread neuronal network including the frontal and the parietal lobes, as 
well as lateral and medial temporal lobe structures (Alho et al., 1998; 
Mecklinger & Ullsperger, 1995). Additionally, research on epileptic patients 
using depth electrode measurements also suggests that the novelty P3 
reflects the activity of a distributed network, with major components in the 
hippocampus, the temporal lobes and dorsolateral prefrontal cortex 
(Baudena et al., 1995; Halgren et al., 1995). 

Taken together, these findings indicate that the processes involved in 
target and novelty detection constitute a widespread and partially 
overlapping neuronal network. However, the exact neuronal substrates of 
these change detector networks are not completely understood. Until 
recently, electrophysiological methods such as ERPs and hemodynamic 
approaches like fMRI were used separately to disentangle the 
neuroanatomical structures mediating change detection. The present chapter 
reviews findings from the use of these methods together. 



3. AN INTEGRATED APPROACH TO THE 
DETECTION OF CHANGE 

Change detection processes occur within seconds and, therefore, require 
an online measure with millisecond time resolution, such as that provided by 
ERPs. These neuroelectric measures are extracted from the ongoing 
electroencephalogram (EEG) by signal averaging and are small voltage 
changes that are time-locked to sensory, motor, or cognitive events (Rugg & 
Coles, 1995). The resulting waveform can be described as a series of 
positive or negative deflections called ERP components with a specific 
latency and amplitude distribution or topography over the scalp. These 
components index the timing and sequence of neuronal activity elicited by a 
particular event. A class of ERP components can be elicited by the detection 
of an event deviating in some manner from the other stimulus events 
presented in the experiment (Donchin et al., 1997). However, these change 
detection components differ among themselves in the nature of the deviating 
event (e.g., small frequency deviants or perceptually novel events) or in the 
extent to which they are elicited by attended (e.g., the P3b) and/or 
unattended (novelty P3) events (Friedman et al., 1998). 
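
The signal-averaging step described above can be sketched as follows. This is a toy illustration on synthetic single-channel data, not the processing pipeline of any study cited here: the sampling rate, noise level, epoch window, and the Gaussian-shaped "component" peaking at 360 ms are all assumed values, and `average_erp` is a hypothetical helper.

```python
import numpy as np


def average_erp(eeg, event_samples, pre, post):
    """Average EEG epochs that are time-locked to the given event samples.

    eeg: 1-D array for one channel; pre/post: samples kept before/after
    each event. Each epoch is baseline-corrected with its pre-event mean.
    """
    epochs = np.array([eeg[s - pre:s + post] for s in event_samples])
    epochs = epochs - epochs[:, :pre].mean(axis=1, keepdims=True)
    return epochs.mean(axis=0)


# Synthetic demonstration (all numbers are assumptions).
rng = np.random.default_rng(0)
fs = 250                      # sampling rate in Hz
n_samples = 300 * fs          # five minutes of recording
pre, post = 25, 150           # 100 ms before, 600 ms after each event
eeg = rng.normal(0.0, 5.0, n_samples)          # background activity
events = np.arange(fs, n_samples - fs, fs)     # one event per second

# Add the same small time-locked deflection after every event.
t_post = np.arange(post) / fs
template = 5.0 * np.exp(-0.5 * ((t_post - 0.36) / 0.05) ** 2)
for s in events:
    eeg[s:s + post] += template

erp = average_erp(eeg, events, pre, post)
peak_latency_ms = (int(np.argmax(erp)) - pre) / fs * 1000.0
```

Because the background activity is not time-locked to the events, it tends to cancel in the average, whereas the time-locked deflection remains.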

Although the temporal resolution of the ERP is excellent, its capability in 
identifying the generating neural sources of the functionally relevant brain 
structures is necessarily approximate. The difficulty is to solve the “inverse 
problem,” which determines the three dimensional distribution of active 
areas inside the brain, based solely on two dimensional external electric or 
magnetic measurements. Given a finite number of electrodes at which the 
potentials are measured, there exist an infinite number of possible solutions 
of the inverse problem and, therefore, a potentially infinite number of brain 
structures can account for the generation of the measured scalp potentials 
(Koles, 1998). Despite this difficulty, source analyses of the P3b and the 
novelty P3 have been performed (cf. Alho et al., 1998; Mecklinger & 
Ullsperger, 1996; Mecklinger et al., 1998). Indeed, if the problem is 
approached by utilizing prior information, one can substantially reduce the 
number of possible solutions. For example, anatomical knowledge can be 
used to constrain sources to locations around the cortical fold (Scherg & 
Berg, 1991). In addition, the activity of neighboring neurons is more likely 
to be synchronized than the activity of neurons that are far from each 
other. In mathematical terms, the task is to find the smoothest of all possible 
solutions of a distributed activity throughout the brain. This method is called 
low resolution electromagnetic tomography (LORETA; Pascual-Marqui et al., 
1994) and has been employed successfully (Anderer et al., 1998). 
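
The idea of selecting the smoothest among infinitely many solutions can be illustrated with a deliberately simplified sketch. It is not the LORETA implementation: sources are placed on a one-dimensional chain rather than in a three-dimensional brain volume, the lead field is a random placeholder instead of a head model, and the regularization weight `lam` is an assumed value. The estimate minimizes the data misfit plus a penalty on the discrete Laplacian of the source distribution.

```python
import numpy as np


def chain_laplacian(n_sources):
    """Discrete 1-D Laplacian; large values mean neighbors differ strongly."""
    return (np.diag(-2.0 * np.ones(n_sources))
            + np.diag(np.ones(n_sources - 1), 1)
            + np.diag(np.ones(n_sources - 1), -1))


def smoothest_inverse(leadfield, potentials, lam=1e-2):
    """Solve min ||L j - v||^2 + lam * ||B j||^2 for the source vector j."""
    lap = chain_laplacian(leadfield.shape[1])
    lhs = leadfield.T @ leadfield + lam * (lap.T @ lap)
    rhs = leadfield.T @ potentials
    return np.linalg.solve(lhs, rhs)


# Toy problem: 16 "electrodes", 50 "sources" (more unknowns than data).
rng = np.random.default_rng(1)
n_electrodes, n_sources = 16, 50
leadfield = rng.normal(size=(n_electrodes, n_sources))  # placeholder model
positions = np.arange(n_sources)
j_true = np.exp(-0.5 * ((positions - 25) / 5.0) ** 2)   # smooth activity
potentials = leadfield @ j_true

j_hat = smoothest_inverse(leadfield, potentials)
rel_residual = (np.linalg.norm(leadfield @ j_hat - potentials)
                / np.linalg.norm(potentials))
```

The estimate reproduces the measured potentials closely while being at least as smooth as the simulated source, which is the sense in which the smoothness constraint singles out one solution.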

An alternative approach to the neuronal localization of cognitive 
processes is provided by other neuroimaging methods such as fMRI. The 
measurement technique is based on changes in local blood oxygenation and 
flow that occur with changes in neural activity. When the appropriate 
imaging parameters are chosen, fMRI is able to detect changes in blood 
oxygenation level since oxygenated hemoglobin has much smaller 
magnetic susceptibility than deoxygenated hemoglobin. This outcome is 
known as the blood oxygenation level dependent (BOLD) effect or BOLD signal 
(Ogawa & Lee, 1990). As suggested by anatomical and physiological 
evidence, fMRI measures are limited by the spatial extent of the circulatory 
system detected at a given field strength. Hence, the higher the field 
strength, the higher the spatial resolution, with spatial resolution as high as 1 
mm possible (Cohen, 1996). The temporal resolution is limited by the 
properties of the brain's circulatory system. Most changes of this 
hemodynamic response that are detectable with fMRI appear after a delay of 
several seconds and take about 6 seconds to reach maximum. Thus, despite 
the excellent spatial resolution, fMRI does not provide the temporal 
resolution required to make inferences about the subprocesses involved in 
change detection (Menon & Kim, 1999). 
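
The sluggishness of the hemodynamic response can be made concrete with a small sketch. Modeling the response as a gamma-shaped curve is a common simplification and is an assumption here, not something stated in the chapter; the shape and scale values are chosen only so that the curve peaks at about 6 seconds, in line with the figure quoted above.

```python
import math

import numpy as np


def gamma_hrf(t, shape=7.0, scale=1.0):
    """Gamma-shaped response curve; its peak lies at (shape - 1) * scale."""
    t = np.asarray(t, dtype=float)
    norm = scale ** shape * math.gamma(shape)
    return t ** (shape - 1.0) * np.exp(-t / scale) / norm


time_s = np.arange(0.0, 30.0, 0.1)       # seconds after a brief event
response = gamma_hrf(time_s)
peak_time_s = float(time_s[np.argmax(response)])
```

Under these assumed parameters the modeled response is still near zero during the first second, the interval in which the ERP components discussed in this chapter have already come and gone.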

In an effort to overcome the intrinsic limitations of each approach, both 
the electrophysiological and hemodynamic measures of change detection 
processes were integrated using neuroanatomically constrained source 
analysis (cf. Opitz et al., 1999a). The fundamental assumption underlying 
this method was that the same brain areas that generate the P3b or the 
novelty P3 in the ERP also would show an increased fMRI BOLD response. 
The fMRI results can then be used to constrain the inverse problem by 
providing the number and locations of possible sources for the associated 
scalp potentials (Mecklinger, 2000). Thus, by modeling the neuronal electric 
activity employing these neuroanatomical constraints, brain structures can be 
identified that are involved in change detection while also monitoring the 
temporal dynamics of the same neural locations. 

For a combined analysis of fMRI and ERP, it would be useful to 
simultaneously record both types of data. A problem with this approach is 
that the changing magnetic field produced by the MR-scanner, in addition to 
the electrical wire moving due to heart beat-related body movements, 
induces electric currents that affect the EEG signal amplifier. It is therefore 
technically difficult to record artifact-free ERPs within the magnetic field of 
the scanner (Allen et al., 1998). For this reason, parallel data acquisition in 
separate sessions with the same subjects employing identical experimental 
procedures can be used. However, with the improvement of recording 
equipment and new signal processing methods the scope of combined 
ERP/fMRI studies can be extended considerably (Allen et al., 2000). 

Although the fMRI-constrained source analysis has a physiological basis, 
it is not reasonable to assume that the distribution of fMRI and ERP signals 
will always perfectly match. To exclude all but the one correct solution, it is 
important that a derived source model is subjected to statistical evaluation 
and fulfills a number of criteria. First, it should explain the empirical 
electrophysiological data very well by accounting for as much of the 
experimental variance as possible (typically greater than 90%). Second, the 
estimated source configuration should be in agreement with structural and 
physiological restrictions. That is, the ERP data are assumed to arise from 
electrical activity in the gray matter with a dipolar orientation perpendicular 
to the cortical surface. Third, a limited number of neurons cannot produce an 
unlimited electric field outside the head. According to realistic estimates 
based on measured current densities, a dipole strength of 10 nAm would 
correspond to 40 mm² of active cortex (Freeman, 1975). As a consequence, a 
brain area of a certain spatial extent, for example 200 mm², while showing 
an increased BOLD response, nonetheless has to be excluded as an ERP 
generator if the activation strength of the respective dipole exceeds a specific 
threshold, or 50 nAm in this example. One reason for the absence of an ERP 
signal even though a BOLD signal is present, is that scalp recorded ERPs are 
generated only by a synchronous modulation of a neuronal population in an 
"open field" configuration (Nunez, 1981), whereas a hemodynamic response 
can be caused by any configuration of neuronal activity. Conversely, neural 
activity as reflected in the ERPs might not have a detectable hemodynamic 
counterpart. For instance, this may be the case when relatively few neurons 
are synchronously highly active but the regional vascular bed is sparse, as is 
the situation in the hippocampus. The experimental variance unexplained by 
a generator model constrains the likelihood estimate of such activity: the 
more unexplained variance, the more likely that sources are missing. 
Therefore, this combined analysis is a probabilistic approach (Opitz et al., 
1999b; Luck, 1999). 
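
The first and third criteria lend themselves to a short numerical sketch. The code below assumes that the relation between dipole strength and active cortical area is linear, using the ratio of 10 nAm per 40 mm² quoted above, and it defines explained variance as one minus the ratio of residual to total signal power; both are simplifying assumptions, and the function names are hypothetical.

```python
import numpy as np

# Ratio taken from the figure quoted in the text (10 nAm per 40 mm^2);
# treating it as a linear scaling is an assumption of this sketch.
MM2_PER_NAM = 40.0 / 10.0


def explained_variance(measured, modeled):
    """Percentage of the measured signal power accounted for by the model."""
    measured = np.asarray(measured, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    residual = measured - modeled
    return 100.0 * (1.0 - np.sum(residual ** 2) / np.sum(measured ** 2))


def implied_cortex_area_mm2(dipole_strength_nam):
    """Cortical area that a dipole of the given strength would require."""
    return dipole_strength_nam * MM2_PER_NAM


def is_plausible_generator(dipole_strength_nam, region_area_mm2):
    """A region is rejected if the dipole needs more cortex than it has."""
    return implied_cortex_area_mm2(dipole_strength_nam) <= region_area_mm2
```

For example, `implied_cortex_area_mm2(50)` gives 200 mm², so a 200 mm² region is compatible with a 50 nAm dipole but not with a stronger one.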

Finally, the model has to be specific for a particular ERP component, 
such that it explains the experimental variance of this component but not of 
others. Moreover, the activation strength of the source model should also be 
maximal in the time range of the specific ERP, thereby indicating a maximal 
contribution solely to this component. This specificity of a source model can 
be estimated by calculating the time course of dipole strength and 
goodness-of-fit over the entire period of measurement. 



4. AN EXAMPLE OF COMBINED ANALYSIS 

The application of this integrated approach is illustrated in two 
experiments that employed either task-relevant targets or perceptually novel 
auditory stimuli. The electrophysiological and hemodynamic brain responses 
were measured in healthy volunteers in two separate sessions of identical 
task situations: a stimulus train comprised of different auditory stimuli where 
the rare targets had to be silently counted by the subjects (Opitz et al., 
1999a). In the second experiment, novel stimuli were included in the 
stimulus train (Opitz et al., 1999b). Findings from studies investigating the 
functional similarity in the processing of environmental novel sounds and 
words suggest that environmental sounds, like words, can activate 
conceptual-semantic representations (van Petten & Rheinfelder, 1995). Thus, 
the novel stimuli were divided into two groups: identifiable novel sounds, 
which were reliably identified by subjects (e.g., telephone bell or dog bark) 
and nonidentifiable novels, which were not (Mecklinger et al., 1997). In 
order to assess temporal aspects of the processing network identified in 
fMRI activation maps, a neuroanatomically constrained dipole analysis was 
employed. Dipole locations were kept fixed according to the fMRI activation 
foci averaged across subjects, whereas dipole orientations were allowed to 
vary to model the ERP data (Opitz et al., 1999b). 
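
A constrained dipole fit of this kind can be sketched as a linear least-squares problem. The sketch rests on several assumptions that are not taken from Opitz et al.: the lead field is a random placeholder rather than a head model, orientation is left free by estimating a three-component moment at every time point (the original analysis may instead have fitted one orientation per dipole), and the function name is hypothetical.

```python
import numpy as np


def fit_fixed_dipoles(leadfield, erp):
    """Fit dipole moments at fixed locations to multichannel ERP data.

    leadfield: (n_electrodes, 3 * n_dipoles), columns ordered x, y, z for
    each fixed location; erp: (n_electrodes, n_times).
    Returns dipole strength (n_dipoles, n_times) and goodness of fit
    (fraction of explained variance) at each time point.
    """
    coeffs, *_ = np.linalg.lstsq(leadfield, erp, rcond=None)
    n_dipoles = leadfield.shape[1] // 3
    moments = coeffs.reshape(n_dipoles, 3, -1)
    strength = np.linalg.norm(moments, axis=1)
    residual = erp - leadfield @ coeffs
    gof = 1.0 - (residual ** 2).sum(axis=0) / (erp ** 2).sum(axis=0)
    return strength, gof


# Synthetic example: two fixed locations, 32 electrodes, 100 time points.
rng = np.random.default_rng(2)
leadfield = rng.normal(size=(32, 6))                    # placeholder model
time = np.arange(100)
course = 20.0 * np.exp(-0.5 * ((time - 50) / 8.0) ** 2)  # in nAm
true_coeffs = np.zeros((6, 100))
true_coeffs[0] = course          # first dipole, x orientation
true_coeffs[4] = 0.5 * course    # second dipole, y orientation
erp = leadfield @ true_coeffs + rng.normal(0.0, 0.1, size=(32, 100))

strength, gof = fit_fixed_dipoles(leadfield, erp)
```

The resulting strength and goodness-of-fit time courses are the quantities against which the specificity of a source model for a particular component is judged.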

Figure 1 shows the ERPs for midline electrodes in response to the targets, 
the identifiable novels, and the nonidentifiable novels. As expected, targets 
elicited a parietally maximal P3b component, with a peak latency of 360 ms 
at the Pz recording site. Conversely, both types of novels elicited 
fronto-centrally focused novelty P3s peaking around 280 ms. Notably, identifiable 
compared to nonidentifiable novel sounds elicited a negative-going 
deflection at right parietal recording sites peaking around 420 ms. In light of 
the temporal and topographical similarities with the N400 component 
produced by semantically unexpected language stimuli this negativity can be 
described as an N400-like component (Kutas & Hillyard, 1983). 

Figure 2 illustrates the overall fMRI data. Four significant clusters of 
BOLD increase were obtained with peaks in the posterior part of the left and 
right superior temporal gyri (STG), adjacent to the supramarginal gyri 
(SMG) and the left and right neostriatum. For modeling the ERP waveforms, 
generators at all four locations derived from the fMRI activation pattern 
were first assumed. However, dipoles located at bilateral neostriatum 
showed an activation strength of more than 200 nAm. Based on the 
measurements by Freeman (1975), this activation strength would correspond 
to 8 cm² of active cortex. Since the neostriatum is much smaller, the 
respective dipoles were excluded from the final dipole solution. 

Figure 1. Grand average waveforms to the targets and both novel types at the midline 
electrodes. Solid line=identifiable novels, dotted line=nonidentifiable novels, dashed 
line=deviant tones. Negative values on top, positive values on bottom. 

Figure 2. Brain areas that showed significant fMRI activation to target stimuli were 
superimposed on an individual brain in Talairach space; left and right superior temporal gyri 
(STG), left supramarginal gyrus (SMG) and neostriatum. Horizontal (left) and coronal 
sections (right) were thresholded at Z=3.01 (p < .001, one tailed) (after Opitz et al., 1999a). 

Figure 3 presents the goodness-of-fit and the activation waveforms of the 
derived solution. This source analysis suggests that only the bilateral 
activation of the posterior part of the superior temporal gyrus contributes to 
the scalp P3b elicited by auditory stimuli. This finding is consistent with 
previous findings (Menon et al., 1997) and source localization performed on 
the basis of MEG data (Alho et al., 1998), which suggest contributions from 
posterior temporal cortex. 

In the second experiment, the fMRI activation pattern associated with 
auditory novelty processing was comprised of two significant clusters of 
activity in the midportion of the left and right STG. Figure 4 illustrates the 
fMRI findings. No significant differences in mean fMRI activation size 
and/or specific lateralization could be obtained for either novel type. 
Furthermore, this superior temporal cortex (STC) activity was located anterior to the hemodynamic 
response to target tones. This outcome could account for the distinctive scalp 
distribution of the novelty P3. Consistent with this result, a contribution of 
anterior temporal cortex activity to the novelty P3 has been observed in an 
MEG study (Alho et al., 1998). Moreover, extensive temporo-parietal 
lesions centered in the superior temporal cortex attenuated the P3 to novel 
sounds, especially at posterior recordings (Knight et al., 1989). Despite this 
converging evidence for superior temporal gyrus contributions to novelty 
processing, there is a lack of consistency with respect to hippocampal 
involvement in these processes. Human lesion studies (Knight, 1996) and 
neuroimaging studies (Martin, 1999; Tulving et al., 1996) have suggested an 
important role of the hippocampal/parahippocampal region in the processing 
of novel information. The present data, in agreement with previous P3b 
studies, did not reveal significant activation within or in the vicinity of this 
brain area (Polich & Squire, 1993). At a functional level, a possible 
explanation of the discrepancy between the present and former data could be 
derived from neuroimaging studies that demonstrated the involvement of 
hippocampal structures in a large variety of cognitive processes, such as 
memory encoding and retrieval (Desgranges et al., 1998). It is conceivable 
that such processes were also produced by target and standard tones and 
could therefore have masked the effects of novelty detection in the 
hippocampus. 




Figure 3. Goodness of fit (% explained variance) as a function of time (top panel) and time 
course of dipole strength (nAm) (middle panel). The scalp potential maps of the fit interval 
for empirical data (left) and the dipole model (right) are shown. Solid line=left superior 
temporal gyrus, dotted line=right superior temporal gyrus (after Opitz et al., 1999a). 

Figure 4. Comparison of fMRI activation to novel sounds (left) and target tones (right) in 
sagittal section through the right temporal lobe. The activations differ along the 
anterior-posterior axis of the superior temporal gyrus, with novel sounds activity located anteriorly. 

Given the differential ERP waveforms obtained for identifiable and 
nonidentifiable novels, the absence of any differences in the fMRI activation 
pattern was surprising. A median split of the sample was therefore performed 
on the most salient ERP difference between identifiable and 
nonidentifiable novels, the N400-effect, which was obtained only for 
identifiable novels. Subjects showing a large N400-effect will be referred to 
as the large difference group (LD-group), while the other subjects, with a 
weak or no N400-effect, formed the small difference group (SD-group). 
Figure 5 displays the ERP waveforms of both novel stimuli at a 
representative electrode from the right parietal region and the scalp 
topography of the N400-effect for both groups. When the fMRI data analysis 
was conducted separately for the LD and SD groups, the activation pattern 
for identifiable and nonidentifiable novels was clearly dissociable for the 
LD-group, but not for the SD-group. Figure 6 illustrates those effects. In the 
LD-group bilateral activation of the midportion of the STC was obtained for 
the nonidentifiable and identifiable novels. In this group, additional right 
frontal activation for identifiable novels was found. 
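
The median split described above can be sketched in a few lines. This is a minimal illustration only: the subject labels and effect sizes are invented, and the rule of assigning subjects strictly above the median to the LD-group is an assumption, since the exact criterion is not stated in the text.

```python
from statistics import median

def median_split(effect_by_subject):
    """Split subjects into large- and small-difference groups at the median."""
    cut = median(effect_by_subject.values())
    large = sorted(s for s, v in effect_by_subject.items() if v > cut)
    small = sorted(s for s, v in effect_by_subject.items() if v <= cut)
    return large, small

# Hypothetical N400-effect sizes (microvolts), one value per subject.
n400_effect = {"s01": 3.1, "s02": 0.4, "s03": 2.2,
               "s04": 0.9, "s05": 1.8, "s06": 0.2}

ld_group, sd_group = median_split(n400_effect)
print("LD-group:", ld_group)
print("SD-group:", sd_group)
```

With an even number of subjects the median falls between two observed values, so both groups end up the same size.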

Based on these results it can be hypothesized that (a) bilateral activation 
of the middle STG accounts for the generation of the auditory novelty P3 
whereas (b) an additional right frontal generator might contribute solely to 
the ERP waveforms for identifiable novels. To test these hypotheses, the 
scalp ERP distribution of the LD-group, for which reliable ERP differences 
between both novel types were obtained, was modeled using dipole 
source locations in both STG and in the right frontal area, as derived from 
the functional images. 

Examination of the time course of dipole activation in the LD group 
revealed that the dipoles in the STG, for identifiable as well as 
nonidentifiable novels, explained more than 90% of the signal variance 
only within the time range from 260 to 360 ms. Figure 7 summarizes the 
findings. These results support the view that the middle STG is one of the 
contributors to the scalp recorded novelty P3. The third dipole in this model 
accounted for 47% of the variance in the ERP elicited by identifiable novels 
but only 15% in the ERP elicited by nonidentifiable novels. The right frontal 
dipole showed a clear maximum of activation strength subsequent to the 
novelty P3 for identifiable but not for nonidentifiable novels. This outcome 
is consistent with the ERP data and further strengthens the observation that 
the right frontal BOLD response is apparent for identifiable but negligible 
for nonidentifiable novels. 
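
The percentages of explained variance quoted above can be related to a simple goodness-of-fit measure. The sketch below uses one common definition (100% minus the ratio of residual to total signal power across electrodes). Whether this exact formula was used in the original analysis is an assumption, and the potential values are invented.

```python
def explained_variance(measured, modeled):
    """Percent of signal variance accounted for by the model at one time point."""
    resid = sum((d - m) ** 2 for d, m in zip(measured, modeled))
    total = sum(d ** 2 for d in measured)
    return 100.0 * (1.0 - resid / total)

# Hypothetical scalp potentials (microvolts) at four electrodes,
# and the potentials predicted by a dipole model at the same sites.
measured = [2.0, 1.0, -1.0, -2.0]
modeled = [1.8, 1.2, -0.9, -2.1]

gof = explained_variance(measured, modeled)
print(round(gof, 2))
```

Evaluating this measure at every time point yields a curve of the kind shown in the goodness-of-fit panels of the figures.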




Figure 5. ERP waveforms for a representative electrode of the right parietal region (upper 
panel) and the scalp distribution (lower panel) of the N400-effect for the LD (left) and SD 
groups (right) are shown. Dark areas in the potential maps indicate positive differences and 
light areas negative differences between identifiable and nonidentifiable novel sounds. 
Negative number on top, positive number on bottom. 

Figure 6. BOLD changes in the prefrontal cortex in response to identifiable (left) and 
nonidentifiable (right) novel stimuli for the LD-group obtained by averaging the groups 
separately. Note that the slice where right frontal activation was obtained for identifiable 
novels is also shown for the nonidentifiable novels although no activation was present (after 
Opitz et al., 1999b). 





Figure 7. Percent explained variance (left) and dipole strength (right) of the dipole solution of 
the right frontal dipole for identifiable (solid trace) and nonidentifiable (dotted trace) novels 
(after Opitz et al., 1999b). 

The involvement of prefrontal cortex in novelty processing was 
suggested by neuropsychological studies showing a decreased ERP response 
to auditory and visual novel stimuli after unilateral prefrontal lesions 
(Knight, 1984). However, the lack of selective amplitude reduction of the 
novelty P3 over the lesioned hemisphere led Knight (1984) to conclude that 
the prefrontal cortex is not the primary generator of these brain potentials, 
and that this activity more likely reflects attentional modulations of 
generators located elsewhere (e.g., in the superior temporal cortex). More 
evidence for the notion that the prefrontal cortex might act as an attentional 
control system was provided by a recent study that compared the novelty P3 
in young and old adults (Friedman et al., 1998). They found a reduced 
novelty P3 for old as compared to young adults in a condition where 
subjects had to attend to the stimulation, but identical novelty 
P3s for young and old adults in an ignore condition. This finding suggests 
that prefrontal activity during an attended novelty oddball task more likely 
reflects attentional modulations rather than a true generator of the scalp 
recorded novelty P3. Furthermore, as none of these studies reported a 
specific lateralization of frontal lobe involvement in novelty processing, 
processes other than novelty detection seem to be reflected in lateralized 
PFC activity in general, and in the rPFC found in the present study, in 
particular. The results of the combined analyses suggest that the rPFC is 
activated only for those novel events that activate a semantic concept and 
that this activation is delayed relative to processes underlying novelty 
detection. In recent functional imaging studies, rPFC activation of a similar 
kind has been found when previously learned material had to be retrieved 
from episodic memory (Brewer et al., 1998; Opitz et al., 2000). Moreover, it 
has been demonstrated that the activity of the prefrontal cortex in such 
memory tasks is modulated by intrinsic stimulus properties (i.e., identifiable 
and nonidentifiable sounds) as well as task demands (Opitz et al., 2000). In 
light of these findings, the activity of the rPFC in the present study appears 
to reflect the encoding and/or retrieval of conceptual semantic information 
carried by identifiable novel sounds. 



5. CONCLUSION 

The data presented together with those of previous studies point to 
differential brain networks underlying target and novelty detection. The 
combined analyses of ERP and fMRI data suggest a major contribution of 
the superior temporal gyrus to the generation of the P3b and the novelty P3, 
with the latter being generated at more anterior loci. Moreover, this combined 
analysis supports the view that novelty processing consists of at least two 
sequential subprocesses: first, an automatically operating novelty detection 
mechanism, subserved by superior temporal structures, and second, further 
processes based on a novel sound's meaning, subserved by right frontal 
cortical areas. The precise nature of the psychological process reflected in 
this rPFC activation remains to be elucidated. Nevertheless, the present 
study demonstrates that this integrated approach provides a new opportunity 
to disentangle the temporal aspects of neural activation underlying auditory 
target and novelty detection. The remaining question is whether the same 
neuronal network underlying auditory change detection will also subserve 
change detection in the visual domain. A recent neuroimaging study has 
shown that similar brain areas including the supramarginal gyrus, insular 
cortex bilaterally, and circumscribed parietal regions are activated during 
auditory and visual target detection (Linden et al., 1999). However, it 
remains unclear which of these brain structures contribute to the scalp 
recorded potentials in the respective stimulus modality. Future dipole 
analyses with neuroanatomical constraints can be utilized to tackle this 
problem. 



1. INTRODUCTION 

The purpose of functional brain mapping is to localize patterns of neuronal 
activity associated with sensory, motor, and cognitive functions, or with disease 
processes. To be complete, an imaging modality needs near millimeter precision 
in localizing regions of activated tissue and sub-second temporal precision for 
characterizing changes in patterns of activation over time. Increasingly fine 
anatomical resolution is available with functional magnetic resonance imaging 
(fMRI). However, fMRI is an indirect measure of neuronal electrical activity 
whose temporal resolution is too gross to resolve the rapidly shifting patterns of 
activity that are characteristic of actual neurophysiological processes. In contrast, 
electroencephalography (EEG) and event-related potential (ERP) methods have a 
temporal resolution typically in the one to five millisecond range, depending on 
the A/D rate. For simplicity the term EEG is used here in a general sense to refer 
both to recordings of brain electrical activity and, except where noted, to 
recordings of brain magnetic activity called magnetoencephalograms or MEGs. 
The nature of MEG recording technology and the relative strengths and 
weaknesses of EEG versus MEG approaches have been reviewed elsewhere 
(Cohen & Cuffin, 1991; Leahy et al., 1998; Williamson & Kaufman, 1987). From 
a broad perspective that considers all neuroimaging modalities, the differences 
between EEG and MEG are slight relative to their similarities. The sensitivity of 
the EEG to changes in mental activity has been recognized since Hans Berger 
reported a decrease in the amplitude of the dominant (alpha) rhythm of the EEG 
during mental arithmetic (Berger, 1929). In addition to the type of tonic alterations 
in brain electrical activity reported by Berger, EEG measurements of phasic, 
stimulus-related changes in brain activity (such as ERPs) are well-suited for 
measuring sub-second component processes of sensory, motor, and cognitive 
processes (Hillyard & Picton, 1987; Regan, 1989). Measurements of the 
coherence, correlation, or covariance of EEG time series from different electrode 
sites help generate hypotheses about the functional networks that form between 
different cortical regions during these processes. The temporal resolution and 
sensitivity of the EEG make it an ideal complement to fMRI. However, the spatial 
detail obtained in most EEG studies has been so coarse that it has only been 
possible to meaningfully interpret EEGs with respect to underlying functional 
neuroanatomy at the level of entire cortical lobes, if at all. The overriding 
limitations in this regard are that the EEG is most often recorded at a small 
number of scalp sites and spatial deblurring methods are not used to compensate 
for distortion due to conduction through the skull. Thus, even though the ability to 
infer the three-dimensional distribution of electrical sources in the brain from 
scalp EEG recordings has fundamental physical limits, the amount of spatial detail 
that can be gleaned from the scalp-recorded EEGs is often not appreciated. 



2. INCREASING SPATIAL DETAIL 

Recording from more electrodes is the first requirement for extracting more 
spatial detail from scalp-recorded EEGs. The nineteen-channel "10/20" montage 
of electrode placement commonly used in clinical EEG recordings has an inter- 
electrode distance of about 6 cm on a typical adult head (Jasper, 1958). This 
spacing may be sufficient for detecting signs of gross pathology or for 
differentiating the overall topography of ERP components, but it is insufficient for 
resolving finer grained topographical details that may be of importance in 
studying cognition. By increasing the number of electrodes to over 100, average 
inter-electrode distances of about 2.5 cm can be obtained on a typical adult head. 
This distance is within the typical cortex-to-scalp point spread function, i.e., 
the size of the scalp representation of a small, discrete cortical source (Gevins et 
al., 1990). 

For electrical (but not magnetic) recordings, the usefulness of such increased 
spatial sampling remains limited by the distortion of neuronal potentials as they 
are passively conducted through the highly resistive skull (Gevins et al., 1991; van 
den Broek et al., 1998). This distortion reflects a spatial low-pass filtering, which 
causes a blurring of the potential distribution at the scalp. In recent years a number 
of spatial enhancement methods have been developed for reducing this distortion. 

The simplest and most widely used of these methods is the spatial Laplacian 
operator, usually referred to as the Laplacian Derivation (LD). It is computed as 
the second derivative in space of the potential field at each electrode. The LD is 
proportional to the current entering and exiting the scalp at each electrode site 
(Nunez, 1981; Nunez & Pilgreen, 1991), and is independent of the location of the 
reference electrode used for recording. The LD is relatively insensitive to signals 
that are common to the local group of electrodes used in its computation and is, 
therefore, relatively more sensitive to high spatial frequency local cortical 
potentials. A simple method of computing the LD assumes that electrodes are 
equidistant and at right angles to each other, an approximation that is only 
reasonable at a few scalp locations such as the vertex. A more accurate approach 
is based on measuring the actual three-dimensional position of the electrodes and 
using 3D spline functions to compute the LD over the actual shape of a subject’s 
head (Le et al., 1994). The main shortcoming of the LD is that it unrealistically 
assumes that the skull has the same thickness and conductivity everywhere, which 
limits the improvement in spatial detail that the method can achieve. 
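
The simple form of the LD mentioned above, which assumes equidistant electrodes at right angles, can be sketched as the potential at an electrode minus the mean potential of its four nearest neighbors. This nearest-neighbor estimate equals the discrete second spatial derivative up to sign and a scale factor set by the inter-electrode distance. All potential values below are invented for illustration. The second half of the example checks the reference independence stated in the text: changing the reference adds the same constant to every channel, and that constant cancels.

```python
def laplacian_derivation(center, neighbors):
    """Nearest-neighbor LD estimate: center potential minus mean of neighbors."""
    return center - sum(neighbors) / len(neighbors)

# Hypothetical potentials (microvolts) at one electrode and its four neighbors.
center = 12.0
neighbors = [8.0, 9.0, 7.0, 10.0]
ld = laplacian_derivation(center, neighbors)

# Re-referencing shifts every channel by the same amount; the LD is unchanged.
shift = -5.0
ld_rereferenced = laplacian_derivation(
    center + shift, [v + shift for v in neighbors]
)

print(ld, ld_rereferenced)
```

The same cancellation explains why the LD is insensitive to signals common to the local group of electrodes and emphasizes spatially local activity.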

This shortcoming of the LD can be ameliorated by using a realistic model of 
each subject's head to locally correct the EEG potential field for distortion 
resulting from conduction to the scalp. One such method is called "Finite Element 
Deblurring." It provides a computational estimate of the electrical potential field 
near the cortical surface by using a realistic mathematical model of volume 
conduction through the skull and scalp to downwardly project scalp-recorded 
signals (Gevins et al., 1991, 1994; Le & Gevins, 1993). Each subject's MRI is 
used to construct a realistic model of his or her head in the form of many small 
tetrahedral elements representing the tissues of scalp, skull, and brain. By 
assigning each tissue a conductivity value, the potential at all finite element 
vertices can be calculated using Poisson's equation. Given that the actual 
conductivity value of each of these finite elements is unknown, a constant value is 
used for the ratio of scalp to skull conductivity; the conductivity of each finite 
element is set by multiplying this constant by the local tissue thickness as 
determined from the MRI. Thus, even though true local conductivity is unknown, 
the procedure is well behaved with respect to this source of uncertainty, because it 
successfully accounts for relative conductivity variation due to regional 
differences in scalp and skull thickness. 

The Deblurring method has been shown to be reliable and more accurate than 
the LD (Gevins et al., 1991, 1994), an improvement that occurs at the expense of 
obtaining and processing each subject's structural MRI. Although Deblurring can 
substantially improve the spatial detail provided by scalp recorded EEG, it does 
not usually provide additional information about the location of generating 
sources. Nevertheless, the improved spatial detail facilitates formation of more 
specific hypotheses about the distribution of active cortical areas during a 
cognitive task. Blurring of brain signals by the skull can largely be avoided by 
recording the magnetic rather than the electrical fields of the brain, because the 
skull has no effect on magnetic field topography. However, this transparency does 
not eliminate the need for utilizing a high density of sensors to accurately map the 
spatial topography of brain magnetic fields. Furthermore, the problems of 
localizing generator sources are as severe for MEG as they are for EEG (see 
below). In addition, the cost of MEG technology is more than an order of magnitude 
greater than that required for EEG studies, and the associated infrastructure 
required to perform MEG studies is more complex and inflexible. Thus, for most 
laboratories and for some applications, such as those in which a subject's head 
cannot be immobilized for long term monitoring or ambulatory recordings, MEG 
does not provide a viable alternative to EEG recordings. 



3. VERIFICATION AND APPLICATION OF HIGH 
RESOLUTION EEG IN EXPERIMENTAL STUDIES 

Exploratory studies of Deblurring and other high resolution EEG techniques 
focused on the spatial enhancement of sensory ERPs, where a great deal of a priori 
knowledge exists concerning their underlying neural generators (Gevins et al., 
1994). These studies demonstrated that Deblurred somatosensory responses 
isolated activity to the region of the central sulcus. Similar localization is obtained 
with movement-related potentials. Figure 1 illustrates the results of Deblurring 
potentials locked in time to a button press response made with the right hand. The 
major foci of activity occur in the contralateral (left) hemisphere, in the 
somatomotor region of the pre- and post-central gyri. Demonstrations like these 
help verify the reasonableness of the approach. A better validation is obtained by 
comparison of the Deblurred potentials with subdural grid recordings in epileptic 
patients undergoing evaluation for ablative surgery. To date these validation 
studies have produced a reasonable degree of agreement between the Deblurred 
potentials and those measured directly at the cortical surface (Gevins et al., 1994). 

Recent developments suggest that high-resolution EEG methods are useful 
tools in the experimental analysis of higher-order brain functions, and functional 
localization of cognitive processes inferred from spatially enhanced and 
anatomically registered neurophysiological measurements can be compared with 
the results of lesion studies and other neuroimaging techniques. As a complement 
to these approaches, the fine-grain temporal resolution of ERP measurements, in 
combination with improved topographic detail, adds valuable insights gained by 
characterizing both the regionalization of functions and the sub-second dynamics 
of their engagement. For example, spatial enhancement of EEGs related to 
component processes in reading has yielded results that are highly consistent with 
current knowledge of the functional neuroanatomy thought to be involved with 
visual pattern recognition and language functions (Gevins et al., 1995). 

Figure 1. Deblurred Movement-Related Potentials. Evoked potentials, time-locked to the onset of a 
button press response made with the right middle finger, were recorded from a high density (128 
channel) electrode montage attached to the scalp. The blurring effects of the scalp and skull were 
mathematically removed using the Deblurring method, and the resulting spatially sharpened data 
were projected onto the cortical surface, which was constructed from the subject’s MRI. Activity at 
two instants prior to the button press (top) and two instants after the button press (bottom) is 
plotted. Both pre- and post-response activation are strongly lateralized to the left hemisphere, the 
hemisphere contralateral to the hand used to make the response. This figure shows lateralized 
activation of the precentral gyrus prior to and immediately after a button press response is made. 
Approximately 250 milliseconds after the response, the focus of activation moves to the postcentral 
gyrus (after Gevins et al., 1999). 



Modern EEG methods have also been used to study sub-second and multi- 
second distributed neural processes associated with working memory, the 
cognitive function of creating a temporary internal representation of information 
during focused thought (Gevins et al., 1996; McEvoy et al., 1998). In task 
conditions that placed a high load on working memory functions, subjects were 
asked to decide if the stimulus on each trial matched either the verbal identity or 
the spatial location of a stimulus occurring three trials (13.5 seconds) previously. 
This required subjects to concentrate on maintaining a sequence of three letter 
names or three spatial locations concurrently; they had to update that sequence on 
each trial by remembering the most recent stimulus and could drop the stimulus 
from four trials back. In two corresponding control conditions, only the verbal 
identity or spatial location of the first stimulus had to be remembered. Both spatial 
and verbal working memory tasks produced highly localized momentary 
modulation of ERPs over prefrontal cortical areas relative to control conditions, 
with Deblurred voltage maxima approximately over Brodmann’s areas 9, 45, and 
46 (Figure 2). These brief (~50 to 200 milliseconds) events occurred in 
parallel with a sustained ERP wave, maximal over the superior parietal lobe and 
the supramarginal gyrus, with a slight right-hemisphere predominance. It began 
~200 milliseconds after stimulus onset, returned to near baseline by ~600 
milliseconds post-stimulus in control conditions, and was sustained up to ~1 
second or longer in the working memory conditions. The sub-second ERP effects occurred in 
conjunction with multi-second changes in the ongoing EEG, of which the theta 
band power focused over midline frontal cortex is shown in Figure 2 (Gevins et 
al., 1997; Smith et al., 1999). These EEG findings may provide the first direct 
evidence in a single experiment supporting the idea that the various types of 
attention are associated with neural processes with distinct time courses in distinct 
neuronal populations. The increased theta band power may be a marker of the 
continuous focused attention required to perform the task and reflect engagement 
of the anterior cingulate gyrus, a conjecture supported by dipole modeling (Gevins 
et al., 1997). In contrast, the momentary attention required for scanning and 
updating the representations of working memory may be indexed by increased 
ERP amplitude peaks over lateralized regions of dorsolateral prefrontal cortex, 
while maintenance of a representation of the stimuli being remembered may be 
reflected in the parietally maximal ERP wave and other concomitant changes in 
the EEG (Gevins et al., 1996, 1997). 

Figure 2. Deblurred Event-Related Potentials and Ongoing EEG Related to Sustained Focused 
Attention. High resolution EEG methods have made it possible to simultaneously measure both sub- 
second phasic and multi-second tonic regional brain activity during performance of cognitive tasks. 
In this experiment a sequence of increased sub-second ERP peaks and waves was observed over 
frontal (first and second columns) and parietal (third column) cortices during a difficult working 
memory task, in comparison to control conditions with lower working memory requirements. These 
sub-second changes in the working memory tasks were accompanied by longer lasting (4 second) 
increases in ongoing EEG theta band power (rightmost column). These EEG findings suggest that 
various types of attention are associated with neural processes that have distinct time courses in 
distinct neuronal populations. Amplitude scale is constant across experimental conditions within 
each column; ERP scale is voltage, EEG scale is z-scored spectral power (after Gevins et al., 1995). 
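
The three-back matching rule of the working memory task described above can be sketched as follows. The letter sequence is invented, and the function name is hypothetical. The trial spacing is inferred from the statement that three trials span 13.5 seconds and is not given directly in the text.

```python
def n_back_matches(stimuli, n=3):
    """For each trial, report whether the stimulus matches the one n trials back."""
    return [i >= n and stimuli[i] == stimuli[i - n]
            for i in range(len(stimuli))]

# Hypothetical letter sequence for the verbal version of the task.
letters = ["B", "K", "R", "B", "T", "R", "B"]
matches = n_back_matches(letters, n=3)
print(matches)

# Trial spacing implied by the text: three trials span 13.5 seconds.
seconds_per_trial = 13.5 / 3
print(seconds_per_trial)
```

The first three trials can never be matches, which is why subjects must hold three items at once and replace the oldest one on every trial.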

These EEG and ERP measurements of sustained focused attention and working 
memory have high test-retest reliability (McEvoy et al., 2000), and vary in 
predictable ways across the lifespan (McEvoy et al., 2001). They also can be 
highly predictive of individual differences in cognitive ability and cognitive style 
as defined by traditional psychometric instruments (Gevins & Smith, 2000), and 
are also sensitive to transient cognitive impairments that can be produced by 
fatigue or some medications (Gevins & Smith, 1999; Gevins et al., 2001). Such 
measures might therefore serve an important role in clinical assessment techniques 
that incorporate both behavioural and neurophysiological indices. 



4. IDENTIFYING THE GENERATORS OF EEG 

Neither the Laplacian Derivation, nor more advanced EEG spatial 
enhancement algorithms such as Deblurring, nor MEG recordings, provide any 
conclusive three-dimensional information about where the source of a scalp- 
recorded signal lies in the brain. In some cases, such as when healthy subjects 
perform difficult cognitive tasks and strong signals are recorded over areas of 
association cortex (i.e., dorsolateral prefrontal, superior and inferior parietal, 
inferotemporal and lateral temporal), the hypothesis that EEG potentials are 
generated in these areas is the most plausible. However, counter-examples can 
always be presented. In addition to visual examination of the potential field 
distribution, "dipole modeling" provides another method for generating 
hypotheses concerning the neuroanatomical loci responsible for generating 
neuroelectric events measured at the scalp (Fender, 1987; Scherg & Von Cramon, 
1985). Dipole modeling uses iterative numerical methods to fit a mathematical 
representation of a focal, dipolar current source, or collection of such sources, to 
an observed scalp-recorded EEG or MEG field. 
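
The fitting idea can be illustrated with a deliberately simplified sketch. It makes several assumptions that are not in the text: the head is treated as an infinite homogeneous conductor rather than a realistic volume conductor, the dipole lies on the vertical axis and points upward, the conductivity value is illustrative, and a grid search over candidate heights stands in for the iterative numerical optimizers used in practice. For each candidate position the best dipole moment follows from a one-parameter least-squares fit, and the position with the smallest residual is retained.

```python
import math

SIGMA = 0.33  # assumed conductivity in S/m (illustrative value only)

def gain(sensor, z0):
    """Potential at `sensor` per unit moment of a z-oriented dipole at (0, 0, z0)."""
    dx, dy, dz = sensor[0], sensor[1], sensor[2] - z0
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    return dz / (4.0 * math.pi * SIGMA * dist ** 3)

def fit_dipole(sensors, potentials, candidate_heights):
    """Grid search over dipole height; moment solved by least squares."""
    best = None
    for z0 in candidate_heights:
        g = [gain(s, z0) for s in sensors]
        moment = (sum(gi * vi for gi, vi in zip(g, potentials))
                  / sum(gi * gi for gi in g))
        resid = sum((vi - moment * gi) ** 2 for gi, vi in zip(g, potentials))
        if best is None or resid < best[0]:
            best = (resid, z0, moment)
    return best[1], best[2]

# Twelve hypothetical sensors on a sphere of radius 0.09 m.
sensors = [(0.09 * math.sin(t) * math.cos(p),
            0.09 * math.sin(t) * math.sin(p),
            0.09 * math.cos(t))
           for t in (0.2, 0.6, 1.0) for p in (0.0, 1.57, 3.14, 4.71)]

# Noise-free synthetic data from a known source (height in m, moment in A*m).
true_height, true_moment = 0.060, 1e-8
data = [true_moment * gain(s, true_height) for s in sensors]

heights = [z / 1000.0 for z in range(40, 81, 5)]
fit_height, fit_moment = fit_dipole(sensors, data, heights)
print(fit_height, fit_moment)
```

With noise-free data the known source is recovered exactly. With real recordings the residual never reaches zero, and the quality of the answer depends heavily on the head model and on how many sources are assumed.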

Source modeling does not, in general, provide a unique or necessarily 
physically correct answer about where in the brain activity recorded at the scalp is 
generated. This is so because solving for the source of an EEG or MEG 
distribution recorded at the scalp is a mathematically ill-conditioned "inverse 
problem" that has no unique solution; additional information and/or assumptions 
are required to choose among candidate source models. Although some of this a 
priori information is obvious (i.e., that the potentials must arise from the space 
occupied by the brain), other assumptions border on presupposing unknown 
information (i.e., that the potentials arise only from the cortex or that the number 
of active cortical areas is known). 
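
The lack of a unique solution can be shown with a toy example. The matrix below is an invented "lead field" that maps three source strengths onto two sensor readings; the numbers have no physiological meaning. Because there are more unknown sources than measurements, a source pattern exists that produces no signal at any sensor, and adding any multiple of it to one solution gives a different source configuration with identical recordings.

```python
def forward(lead_field, sources):
    """Sensor readings produced by a set of source strengths."""
    return [sum(g * s for g, s in zip(row, sources)) for row in lead_field]

# Hypothetical lead field: 2 sensors (rows) by 3 sources (columns).
lead_field = [[1, 2, 3],
              [0, 1, 1]]

sources_a = [1, 0, 2]

# A source pattern that is invisible at the sensors (maps to zero).
silent_pattern = [-1, -1, 1]

# A second, different configuration built by adding the silent pattern.
sources_b = [s + 2 * n for s, n in zip(sources_a, silent_pattern)]

print(forward(lead_field, silent_pattern))
print(forward(lead_field, sources_a), forward(lead_field, sources_b))
```

Both configurations give the same readings, so the data alone cannot choose between them. This is why constraints such as a fixed number of dipoles or anatomically restricted locations have to be supplied from outside the recording.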

One simple, convenient, and potentially clinically useful approach for 
potentials elicited by simple sensory stimulation is to assume that the scalp 
potential pattern arises from a single point dipole source (see Figure 3). Although 
not anatomically or physiologically realistic, such simple models can sometimes 
be useful for locating the center of mass of primary sensory cortex and hence 
major functional landmarks such as the central sulcus. When justified by simple 
voltage topography (e.g., Figure 4), models of this sort can also be useful for 
generating initial hypotheses about the possible sources underlying other 
phenomena. 

Figure 3. Localization of an EEG dipole model in the somatosensory cortex of the right hemisphere 
from scalp-recorded data evoked in response to transient electrical stimulation of the left index 
finger. This popular type of source generator localization modeling produces anatomically plausible 
results in the case of simple sensory stimulation (after Gevins, et al., 1993). 




Figure 4. Deblurred frontal midline theta EEG activity and localization of corresponding source 
model in the region of the anterior cingulate cortex. Topographic data correspond to the difficult 
working memory task condition depicted in Figure 2 (Gevins et al., 1997). The data were processed 
with the Deblurring method, and the spatially sharpened results were projected onto the cortical 
surface, which was constructed from the subject’s MRI. The upwards-oriented arrow superimposed 
on the midline sagittal image depicts the localization of a point dipole source model for these data 
(after Gevins, et al., 1999). 

Most complex scalp-recorded neurophysiological phenomena are poorly 
approximated by a single dipole source model. Obtaining estimates of the strength 
and 3D locations of the underlying neuronal generators when there are multiple, 
time-overlapped active sources has widely recognized practical and theoretical 
difficulties (Miltner et al., 1994). Intensive effort is being devoted to the 
development of improved methods for source analysis for electrical phenomena 
that are likely to arise from multiple and/or distributed sources (Gorodnitsky et al., 
1995; Grave de Peralta-Menendez & Gonzalez-Andino, 1998; Koles, 1998; 
Tesche et al., 1995; Wang et al., 1993). Even so, regardless of which method is 
used to formulate them, such source generator hypotheses must ultimately be 
independently verified. In rare cases, this might be done in patient populations in 
the context of invasive recordings performed for clinical diagnostic purposes. 
More commonly, another type of imaging modality, such as fMRI, has to be 
employed. One promising approach to this issue is to use information about the 
cortical regions activated by a task as mapped by fMRI to constrain source 
models, and to derive information about the spatiotemporal dynamics of those 
sources from ERP measurements (George et al., 1995; Heinze et al., 1994; 
Mangun et al., 1998; Sereno, 1998; Simpson et al., 1995). 



5. DISTRIBUTED FUNCTIONAL NETWORKS OF 
SIMPLE COGNITIVE TASKS 

Independently of whether definitive knowledge of source configurations exists, 
changes in the spatial distribution of EEG phenomena can be used to characterize 
the neural dynamics of thought processes. Even the simplest cognitive tasks 
require the functional coordination of a large number of widely distributed 
specialized brain systems. A simple response to a sensory stimulus involves the 
coordination of sensory, association and related areas that prepare for, register and 
analyze the stimulus, the motor systems that prepare for and execute the response, 
and other distributed neuronal networks. These distributed networks serve to 
allocate and direct attentional resources to the stimulus, to relate the stimulus to 
internal representations of the self and environment in order to decide what action 
to take, to initiate or inhibit the behavioral response, and to update internal 
representations after receiving feedback about the result of the action. In the 
ongoing EEG, hypotheses about functional interactions between cortical regions 
are sometimes drawn from measurements of statistical inter-relationships between 
time series recorded at different sites. These can be quantified by various 
measures of spectral, wave shape, or information-theoretic similarity, including: 
spectral coherence (Walter, 1963), correlation (Brazier & Casby, 1952; Gevins et 
al., 1981, 1983; Livanov, 1977), covariance (Gevins et al., 1987; Gevins et al., 
1989a; Gevins et al., 1989b), information measures (Callaway & Harris, 1974; 
Mars & Lopes da Silva, 1987), nonlinear regression (Lopes da Silva et al., 1989) 
and multichannel time-varying autoregressive modeling (Gersch, 1987).

Some of the above methods can be used to characterize the spatiotemporal 
relationships of sub-second ERP components. Since the ERP waveform delineates 
the time course of event-related mass neural activity of a neuronal population, the 
coordination of two or more populations during task performance should be 
signaled by a consistent relationship between the morphology of the ERP 
waveforms emitted by these populations, with a consistent time delay (Gevins &
Bressler, 1988). If the relationships are linear, as they often appear to be, this 
coordinated activity might be measured by the lagged correlation or covariance 
between the ERPs, or segments of ERPs, from different regions (Gevins et al., 
1987; Gevins et al., 1989a; Gevins et al., 1989b). One such measure of this type of 
process is referred to as an Event-Related Potential Covariance (ERPC). Of 
course, a significant covariance of this type is only a measure of statistical 
association, and does not map the actual neuronal pathways of interaction between 
functionally related populations. Studies of the neurogenesis of ERPCs are still in 
their infancy (Bressler et al., 1993; Gevins et al., 1994), and any interpretations of 
ERPCs in terms of the underlying neural processes that generate them must thus 
be made very cautiously. However, it is noteworthy that ERPC results to date have 
been highly consistent with the known large-scale functional neuroanatomy of 
frontal, parietal, and temporal association cortices. The ERPCs are beginning to 
provide fascinating glimpses of the complex, rapidly shifting distributed neuronal 
processes that underlie simple cognitive tasks. 
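
As a rough illustration of the lagged covariance idea described above, the following Python sketch computes the covariance between two averaged ERP segments over a range of lags and reports the lag with the largest magnitude. It is a minimal sketch under stated assumptions and not the published ERPC procedure: the function names, the synthetic waveforms, and the 3-sample delay are all hypothetical, and real analyses would add filtering, windowing, and significance testing.

```python
import math


def lagged_covariance(x, y, lag):
    """Covariance between x[t] and y[t + lag] over the overlapping samples."""
    if lag >= 0:
        xs, ys = x[:len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[:len(y) + lag]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n


def best_lag(x, y, max_lag):
    """Return (lag, covariance) for the lag with the largest absolute covariance."""
    candidates = ((k, lagged_covariance(x, y, k)) for k in range(-max_lag, max_lag + 1))
    return max(candidates, key=lambda pair: abs(pair[1]))


# Hypothetical data: a damped oscillation at one site, and the same waveform
# arriving 3 samples later at a second site.
x = [math.sin(2 * math.pi * t / 20) * math.exp(-((t - 30) / 10) ** 2) for t in range(64)]
y = [0.0] * 3 + x[:-3]

lag, cov = best_lag(x, y, max_lag=8)
print(lag, cov)
```

A large covariance at a nonzero lag is, as noted above, only a statistical association between the two waveforms; it says nothing by itself about the pathway linking the underlying populations.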

The ERPC technique has yielded its most interesting results as a tool for 
studying preparatory attentional networks, the changes in brain activity associated 
with readiness for an impending event or action. For example, subjects in one 
experiment performed a task that required graded finger pressure responses with 
either the right or left hand proportional to visual numeric stimuli from 1 to 9 
(Gevins et al., 1987; Gevins et al., 1989a; Gevins et al., 1989b). The hand to be 
used was cued one second before the stimulus. A 375-millisecond ERPC analysis 
window spanned the interval preceding the stimulus number in order to measure 
how ERP patterns differed according to the hand subjects expected to use. Figure 
5 shows right-hand preparatory ERPCs for seven subjects for those trials for 
which the response (about 0.5 to 1 second later) was subsequently either accurate or
inaccurate. The set of subsequently accurate trials is characterized by covariances 
of the left prefrontal electrode with electrodes overlying the same motor, 
somatosensory and parietal areas that were involved in actual response executions 
(simultaneous measurement of flexor digitorum muscle activity showed that the
finger that would subsequently respond was not active during the preparatory 
interval). The preparatory patterns preceding inaccurate responses differed 
markedly from those preceding accurate responses, with fewer ERPCs between 
the left frontal site and other electrodes. Such results suggest that one important 
role of frontal lobe integrative mechanisms is the anticipatory scheduling and 
coordination of the activation of those specialized brain regions that will 
participate in an upcoming cognitive event. 

Figure 5. Preparatory Event-Related Potential Covariance (ERPC) patterns preceding accurate and
inaccurate responses. ERPCs involving left frontal, midline precentral and left central and parietal
electrode sites are prominent in patterns preceding (by 0.5 to 1 second) accurate responses (left). The
number and magnitude of ERPCs are smaller preceding inaccurate responses (right) (after Gevins et
al., 1995).

Finally, an obvious but often unappreciated feature of EEG technology is 
worth mentioning, namely its extreme compactness and simplicity. This fact has
important practical implications, which are frequently overlooked in
scientific discussions of brain mapping methods. For example, the EEG has the
potential to serve as a sensitive, low-cost, and portable monitor of cognition for 
clinical assessment and other applications (Gevins, 1998; Gevins et al., 1998). The 
compactness of EEG technology also means that, unlike all other functional 
neuroimaging modalities (which require massive machinery, large teams of 
technicians, and complete immobilization of the subject) EEGs can be collected 
from an ambulatory subject who is literally wearing the entire recording apparatus. 
This feature of EEGs will facilitate research into the as yet uncharted territory of 
how brains think when performing everyday activities in the real world (Gevins et 
al., 1995; Smith et al., 2001). 



6. CONCLUSIONS

The neurophysiology of mentation involves rapid coordination of processes in 
widely distributed cortical and subcortical areas. The electrical signals that 
accompany higher cognitive functions are subtle, spatially complex, and change 
both in a tonic multi-second fashion and phasically in sub-second intervals in 
response to environmental demands and internal representations of environment 
and self. No single brain imaging technology is currently capable of providing 
both near millimeter precision in localizing regions of activated tissue and
sub-second temporal precision for characterizing changes in patterns of activation over
time. However, by combining several technologies, it seems possible to achieve
this fine degree of spatiotemporal resolution. Modern high-resolution EEG is
especially well suited for monitoring rapidly changing regional patterns of 
neuronal activation accompanying purposive behaviors, while fMRI seems ideal 
for precisely determining their three-dimensional localization and distribution. 
Current research is seeking to determine how to combine EEG and fMRI data 
from the same subjects doing the same tasks. 



1. INTRODUCTION

In contrast to the alpha rhythm, which is the dominant large scale oscillation in 
the human EEG, most of what is known about the theta rhythm stems from animal 
research. Therefore, the most important properties and the functional meaning of 
the hippocampal EEG will be reviewed first. Research with human subjects then 
will be reported, which indicates that an event-related increase in theta power is 
associated with increasing memory demands in a similar way as was found in 
animal research for the hippocampal EEG. Finally, it will be shown that sleep is 
important for memory consolidation. The involvement of the hippocampal 
formation will be discussed. 



2. HIPPOCAMPAL THETA: SOME BASIC FACTS 

The frequency of theta recorded from the hippocampus of lower mammals can
vary between about 3.5 and 12 Hz (Lopes da Silva, 1992) and, therefore, shows a
much wider frequency range than in humans where theta lies within a range of 
about 4 to 7.5 Hz. A regular oscillatory pattern can be observed in the 
hippocampal EEG (which also is termed Rhythmic Slow Activity or RSA) if 
animals make voluntary movements (Vanderwolf, 1992; Vanderwolf & Robinson, 
1981), during exploratory behavior (Buzsaki et al., 1994) and also in REM sleep 
(Winson, 1990). In the absence of exploratory behavior (e.g., in slow wave sleep, 
SWS, or during alert immobility) the hippocampal EEG shows slow irregular 
activity in the delta frequency range (of about 1.5 to 4 Hz), which has received 
different names (Irregular Slow Activity, ISA; Large Irregular Activity, LIA; 
Sharp Waves, SPW) but will be called here SPW (Buzsaki et al., 1994). 

Theta is usually recorded from microelectrodes implanted in the CA1 or the
dentate layers of the hippocampus. It is induced from the septum, which serves as
a primary pacemaker (Petsche et al., 1962; for reviews see Buzsaki et al., 1983
and Miller, 1991). The extracellular currents that underlie theta activity reflect
synchronous fluctuations in the membrane potentials of pyramidal and granule
cells (Bland, 1986; Buzsaki et al., 1983). Two facts are of crucial importance: (1)
Most excitatory (pyramidal, granule) cells are silent during theta activity; only
interneurons burst at theta (and gamma) frequency, paced by septal cells.
(2) Theta operates to silence most principal cells (pyramidal, granule) by keeping
their membrane voltage below but at the same time close to the firing threshold.

During theta activity, which is induced by septal neurons, a few entorhinal afferents
and/or granule cells are sufficient to selectively activate principal cells that start
bursting in the theta frequency range. When theta activity ceases, SPWs appear. 
Although there is high overlap, some brain regions are differentially involved in 
theta and SPW activity as shown schematically in Figure 1. Note that superficial 
layers of the entorhinal cortex that are an important input structure for the 
hippocampus are already paced within theta frequency. Deep layers that send 
fibers to other parts of the neocortex display SPWs after theta ceases. It has been 
emphasized that SPWs develop from the cells that fire in synchrony with the theta 
rhythm (Buzsaki, 1996).

2.1 HIPPOCAMPAL ACTIVITY AND BEHAVIOR 

It is well known that the frequency of theta is related to different types of 
behavior. Voluntary movements that are observed during exploratory behavior 
(e.g., walking, jumping, etc.) can be characterized by a highly regular theta 
oscillation, which is termed type 1 theta. In rodents, type 1 theta ranges from 
about 6.5 to 12 Hz. A somewhat slower and more irregular theta frequency can be 
observed during a state of immobile alertness, if sensory stimuli are presented 
while animals are immobile but alert and, thus, in a state of arousal (Montoya et 
al., 1989). This type 2 theta frequency varies within a range of about 4-9 Hz. A 
third type of behavior characterized by intermittent SPWs (with a duration of 
about 40 to 120 milliseconds and a frequency of about 0.02 to 3.5 Hz) can be 
observed during sleep and “automatic” motor patterns. Thus, frequency and
regularity of hippocampal activity distinguish among different types of behavioral 
states such as slow wave sleep (SWS), exploratory behavior, and awake 
immobility (Buzsaki et al., 1983). 

[Figure 1 panel text: Shaded regions typically display theta activity (during
exploratory behavior and REM sleep). When theta oscillations disappear, SPWs
(sharp waves) appear (shaded areas) during automatic movements and SWS.]

Figure 1. Interplay between theta and sharp waves (SPWs, in the delta frequency range). Theta is an
inhibitory rhythm that operates to filter out highly selective excitatory network connections (Figure
1a). These connections are assumed to form the neuronal basis for the encoding of new (explicit)
information. When theta ceases, they form the basis for the appearance of SPWs, which may be the
electrophysiological correlate of memory consolidation (after Buzsaki, 1996).

Attentional demands are absent during SWS (when hippocampal activity is 
slowest and completely irregular) but highest during exploratory behavior (when 
theta frequency is highest and most regular). It is reasonable to assume that
attention increases from SWS through awake immobility, automatic motor patterns
(such as drinking, eating, face washing, and grooming), and high alertness during
immobility, and finally to voluntary movements during exploratory behavior.
Hence, a perfect association between the frequency of the hippocampal EEG 
(within a range of 0.02-12 Hz) and attentional demands can be detected. In 
addition, with increasing attention, the hippocampal EEG becomes more regular 
until EEG-theta becomes synchronized within the small range of peak theta
frequency (cf. Leung et al., 1982; Buzsaki et al., 1983). 

Theta phase as well as theta power and frequency are related to behavior. 
Within the pyramidal cell layer of the rat hippocampus, cells have been found that,
unlike theta cells, respond with a more complex pattern of spikes. O’Keefe and
Dostrovsky (1971) observed that this cell type responds to the animal's location in 
the environment. Phase locked to the concurrent EEG-theta, these ‘place cells’ fire 
several bursts of spikes as the rat runs through a particular location in its 
environment, which is called the place field of that cell. The most important result 
is that the firing of place cells began at a particular phase of theta frequency as the 
rat entered the place field. Within a particular place field the angle of theta phase 
was related to a particular spatial location (O'Keefe & Recce, 1993; O'Keefe, 
1993). Thus, when the place field is considered a specific stimulus, theta activity
is phase locked to the appearance of this stimulus (cf. Givens, 1996).

2.2 HIPPOCAMPAL THETA REFLECTS MEMORY 
PROCESSES 

The hippocampal EEG distinguishes between different behavioral states, but 
what exactly is the functional meaning of the different “types” of theta? A first 
hint comes from the fact that type 1 theta is associated with a behavioral state (i.e., 
exploratory behavior) in which not only attentional but also memory demands are 
highest. It is a well established finding that the hippocampal formation together 
with a complex structure of other brain regions (Squire, 1992) is important for the 
encoding and possibly also retrieval of new information (Scoville & Milner, 
1957). Hence, hippocampal theta rhythm might be related to these types of
memory processes. Strong evidence for this hypothesis has come from studies that 
have documented a preference for long-term potentiation (LTP) to occur in the 
hippocampal formation, and that theta activity induces or at least enhances LTP 
(Larson et al., 1986; Greenstein et al., 1988; Maren et al., 1994). Because LTP is
considered the most important electrophysiological memory mechanism for the 
encoding of new information, experimental evidence for a functional relation 
between LTP and theta provides an important argument supporting theta as an 
electrophysiological correlate of working memory. Indeed, the induction of LTP is 
optimal with stimulation patterns that mimic the theta rhythm (Larson et al., 
1986), whereas non-repeated, short stimulations are usually ineffective. In
addition, the induction of LTP depends on the phase of theta rhythm, and the 
strength of the induced LTP was found to increase linearly with increasing theta 
power (Maren et al., 1994, Figure 6). These findings provide consistent evidence 
that hippocampal theta is related to the encoding of new information, just as LTP 
is. 

Several theories and reviews of hippocampal theta and memory bear on the
question of whether theta activity can be analyzed in the human scalp
EEG (Buzsaki et al., 1994; Lisman & Idiart, 1995; O’Keefe & Burgess, 1999).
Miller’s (1991) theory of resonant loops is of particular importance. He assumes a 
rhythmic interplay between the hippocampus and cortex. The human scalp EEG 
primarily reflects cortical activity, and theta activity in the hippocampus,
deep inside the brain, would be biophysically difficult to detect directly from
scalp electrodes. However, if hippocampal theta is induced into the cortex, as
the resonant loop theory assumes, it should be possible to detect memory related
changes in theta activity even in the human scalp EEG.



3. THETA ACTIVITY IN THE HUMAN SCALP EEG 

The traditional view holds that theta frequency lies within a fixed range of 
about 4 to 8 Hz just below alpha frequency (about 8-13 Hz). A fixed frequency 
band would be justified only if it can be demonstrated that theta does not vary 
among subjects. Thus, before considering findings from the human scalp EEG it is 
important to address the following two questions: (1) Is there a physiological 
criterion that specifies which frequency marks the transition between alpha and 
theta? (2) Does theta frequency vary in a similar way as has been demonstrated for 
alpha frequency (Klimesch, 1999)?

The answer to the first question is surprisingly easy. It is a well established fact 
that with increasing task demands theta synchronizes, whereas alpha 
desynchronizes. If EEG power in a resting condition is compared with a test 
condition, alpha power becomes suppressed (desynchronizes), whereas theta 
power increases (synchronizes). The specific frequency in the power spectrum that 
marks the transition between an event-related increase in theta and a decrease in 
alpha power can be considered the individual transition frequency (TF) between 
the alpha and theta band for individual subjects. 

When using this method to estimate TF, the second question can also be 
answered: TF shows a large interindividual variability (ranging from about 4 to 7 
Hz), which is linked to the interindividual variability of alpha peak frequency. 
Preliminary evidence for a covariation between theta and alpha frequency was 
already found by Klimesch et al. (1994). Additional evidence for such a 
covariation has also been obtained using the following method (Klimesch et al., 
1996). First, power spectra for the reference and test intervals were calculated for 
each subject and averaged over all trials and all leads. Then, the frequency of the 
transition region between the theta and alpha band was determined within a 
frequency window of 3.5-7.5 Hz. For those few subjects who showed an
asymmetric alpha desynchronization (with no desynchronization in the lower
alpha) and therefore failed to show an intersection (in the range of 3.5-7.5 Hz), the
transition between the theta and alpha band was taken to be the frequency at which
the difference between the test and the reference interval reached a minimum.
Klimesch et al. (1996) found that Spearman’s rank correlation between alpha peak 
frequency and TF yielded a significant value of rho=0.64 (p<.02), with similar 
results reported by Doppelmayr et al. (1998a). These findings document that theta
varies as a function of alpha frequency and suggest using individual alpha
frequency as a common reference point for defining different frequency bands,
including theta. For theta, the individual determination of frequency bands may
be even more important, because the effects of theta synchronization are otherwise
masked by alpha desynchronization, particularly in the range of the TF.
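
The TF estimation procedure described above can be sketched in Python as follows. This is a hedged illustration only: the function name, the linear interpolation between spectral bins, and the example spectra are assumptions made here, and the fallback is read as "the frequency with the smallest absolute test-minus-reference difference", which is one possible interpretation of the minimum-difference rule.

```python
def transition_frequency(freqs, ref_power, test_power, lo=3.5, hi=7.5):
    """Estimate the theta/alpha transition frequency (TF) within [lo, hi] Hz.

    Looks for the point where test-minus-reference power changes from an
    increase (theta synchronization) to a decrease (alpha desynchronization).
    If the two spectra do not intersect in the window, falls back to the
    frequency with the smallest absolute difference (an assumption).
    """
    window = [(f, t - r)
              for f, r, t in zip(freqs, ref_power, test_power)
              if lo <= f <= hi]
    for (f0, d0), (f1, d1) in zip(window, window[1:]):
        if d0 > 0 >= d1:
            # Linear interpolation of the zero crossing between two bins.
            return f0 + (f1 - f0) * d0 / (d0 - d1)
    return min(window, key=lambda fd: abs(fd[1]))[0]


# Hypothetical spectra sampled every 0.5 Hz from 3.5 to 7.5 Hz.
freqs = [3.5 + 0.5 * i for i in range(9)]
ref_power = [10.0] * 9
test_power = [12.0, 12.0, 12.0, 11.5, 11.0, 10.5, 9.5, 9.0, 8.0]

tf = transition_frequency(freqs, ref_power, test_power)
print(tf)
```

In this made-up example the power difference changes sign between 6.0 and 6.5 Hz, so the interpolated TF falls midway between those two bins.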

In general, TF lies about 4 Hz below the individually determined alpha 
frequency. As an example, in a sample of 10 subjects with a mean alpha frequency 
of 10.7 Hz, TF lies at 6.7 Hz (Klimesch et al., 1996). If a bandwidth of 4 Hz
were assumed, theta frequency would cover a range of 2.7 to 6.7 Hz. The
resulting lower boundary of 2.7 Hz appears too low, since this frequency is
considered to belong to the delta frequency range. To
avoid overlap with delta, theta frequency can be defined as that band with a width 
of just 2 Hz that falls below TF. Thus, in our example theta is a band of 4.7 to 6.7 
Hz. All of the findings reported below are based on individually adjusted theta
bands of 2 Hz width, and the results suggest that the theta frequency range in
humans may be much narrower than originally assumed.
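
The arithmetic behind the individually adjusted theta band can be written out as a short Python sketch. The function and parameter names are hypothetical; the 4 Hz offset and 2 Hz width are the values given in the text, and the 10.7 Hz alpha frequency is the sample mean from the example above.

```python
def individual_theta_band(alpha_peak_hz, tf_offset_hz=4.0, band_width_hz=2.0):
    """Return (lower, upper) theta band edges in Hz for one subject.

    TF is approximated as the individual alpha frequency minus tf_offset_hz,
    and theta is the band of width band_width_hz lying just below TF.
    """
    tf = alpha_peak_hz - tf_offset_hz
    return (tf - band_width_hz, tf)


lower, upper = individual_theta_band(10.7)
# Rounded to one decimal place this gives the 4.7 to 6.7 Hz band from the text.
print(round(lower, 1), round(upper, 1))
```

A subject with a slower alpha rhythm would, under the same rule, receive a correspondingly lower theta band, which is the point of anchoring the bands to individual alpha frequency.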

3.1 THETA ACTIVITY IN THE HUMAN SCALP EEG
AND MEMORY PROCESSES 

A well known procedure to measure event-related changes in band power is 
based on a method originally proposed by Pfurtscheller and Aranibar (1977). 
Event-related band power changes are defined as the percentage decrease
(termed event-related desynchronization, ERD) or increase (termed event-related
synchronization, ERS) in band power during a test interval with respect to a
reference interval. The measurement of ERD/ERS is done in several steps. First,
the EEG is band pass filtered within defined frequency bands, with the filtered
data squared and then averaged within consecutive time intervals (e.g., 125
milliseconds). Second, the obtained data are averaged over the number of epochs.
Third, band power changes are expressed as the percentage of a decrease or
increase in band power during a test as compared to a reference interval by
using the following formula: ERD = ((band power reference - band power
test)/(band power reference)) * 100. Note that desynchronization is reflected by
positive ERD values, whereas synchronization is reflected by positive ERS (or
negative ERD) values.
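
The steps above can be sketched in Python as follows. This is a minimal illustration, not the original analysis software: it assumes the epochs have already been band pass filtered, the function names are hypothetical, and the bin length in samples depends on the sampling rate (for instance, 32 samples would correspond to 125 milliseconds at an assumed 256 Hz).

```python
def band_power_timecourse(filtered_epochs, samples_per_bin):
    """Square filtered samples, average within consecutive bins, then average over epochs.

    filtered_epochs: list of equal-length lists of band-pass-filtered samples.
    Returns one mean band power value per time bin.
    """
    n_bins = len(filtered_epochs[0]) // samples_per_bin
    per_epoch = []
    for epoch in filtered_epochs:
        bins = []
        for b in range(n_bins):
            chunk = epoch[b * samples_per_bin:(b + 1) * samples_per_bin]
            bins.append(sum(v * v for v in chunk) / samples_per_bin)
        per_epoch.append(bins)
    # Average each time bin across epochs.
    return [sum(column) / len(column) for column in zip(*per_epoch)]


def erd_percent(reference_power, test_power):
    """ERD = ((reference - test) / reference) * 100; negative values indicate ERS."""
    return (reference_power - test_power) / reference_power * 100.0


# Tiny hypothetical example: two epochs, two bins of two samples each.
epochs = [[1.0, 1.0, 2.0, 2.0], [1.0, 1.0, 2.0, 2.0]]
power = band_power_timecourse(epochs, samples_per_bin=2)
reference, test = power[0], power[1]
print(erd_percent(reference, test))
```

Here band power rises from the reference bin to the test bin, so the ERD value is negative, which corresponds to synchronization (ERS) in the sign convention of the formula.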

The studies summarized below were designed to investigate the question of 
whether an event-related increase in theta power selectively reflects the successful 
encoding and/or retrieval of new episodic information. This hypothesis was 
derived from reviews focusing on findings in functional neuroanatomy, 
electrophysiology, amnesia, and memory research; it was proposed that the 
hippocampal theta rhythm primarily reflects the encoding and retrieval of episodic 
memory (Klimesch, 1995, 1996, 1999). 

The fact that theta power increases in a large variety of different tasks seems to
contradict the suggested hypothesis of a specific relationship between theta and
the processing of new information (Arnolds et al., 1980; Schacter, 1977).
However, the processing of new information is to some extent necessary
for the performance of almost any type of task. Of particular importance is the
experimental control of unspecific factors (such as attentional demands, task 
difficulty, and cognitive load) that usually accompany the processing of new 
information. The following studies compared different types of memory demands 
(such as episodic and semantic memory) and processes (such as incidental and
episodic encoding, successful and unsuccessful encoding and/or retrieval). 

In a first study, the relationship between theta synchronization, alpha 
desynchronization, episodic and semantic memory demands was investigated 
(Klimesch et al., 1994). The experimental design consisted of two parts. Subjects 
first performed a semantic congruency task in which they judged whether the 
sequentially presented words of concept-feature pairs (such as “eagle-claws” or
“pea-huge”) were congruent. They were then asked to perform an unexpected
episodic recognition task. This approach helped to prevent subjects from using 
semantic encoding strategies and to increase episodic memory demands. In the 
episodic task, the same word pairs were presented together with new distractors 
(generated by re-pairing the already known concept-feature words). In order to 
perform the task correctly, subjects had to know whether or not a particular 
concept-feature pair was already presented during the semantic task. Because 
distractors were semantically similar and generated by re-pairing previously 
presented words, subjects were able to give a correct response only if new 
episodic information (represented by a specific combination of a concept and 
feature word) was actually stored in memory. The results of a reaction time study 
with the same experimental paradigm (Kroll & Klimesch, 1992, Experiment 4) 
indicated that semantic features speeded up semantic but slowed down episodic 
decision times. Thus, semantic and episodic memory processes can well be 
differentiated behaviorally (for a review, see Klimesch, 1994). 

Furthermore, as pairs of items are presented, the episodic and semantic task 
could be performed only after the second item of a pair (i.e., the feature) was 
presented. The critical electrophysiological issue was to compare the amount of 
theta synchronization during the presentation of the concept and feature word in 
the episodic and semantic task. Only correctly identified concept-feature pairs
were analyzed. The results demonstrated that only in the episodic task and only 
when the feature word was processed was the expected increase in theta power 
observed. During the processing of the concept words, theta power evinced almost 
identical values in the semantic as well as the episodic task. Because the same 
words were presented in both tasks and all other variables were kept constant 
(exposure time, length of interstimulus interval, etc.), the findings support the 
hypothesis of a specific relationship between memory (for new episodic 
information) and theta synchronization. The fact that the theta band responds 
selectively to episodic task demands is also demonstrated by the finding that the 
lower and the upper alpha band show quite different results. Whereas
event-related changes in the lower alpha band were comparatively small, the upper
alpha band shows a much larger degree of desynchronization during the semantic as
compared to the episodic task. Thus, there is a dissociation between theta
synchronization, which is maximal during the processing of new information, and 
upper alpha desynchronization, which is maximal during the processing of 
semantic information. 

Similar results were obtained in a study with three experiments in which words 
were presented and had to be encoded into working memory (Klimesch et al., 
1997b). In all experiments, a short-lasting theta synchronization was observed only
at occipital sites and only during the first 500 milliseconds after a word was
presented visually. This increase in theta power was particularly
strong at occipital areas and most likely reflects the encoding of new information 
at occipital sites. In a replication and extension of these findings, it was also 
demonstrated that the extent of upper alpha desynchronization is significantly 
correlated with semantic memory performance, whereas the extent of theta 
synchronization is significantly correlated with episodic memory performance 
(Klimesch et al., 1997a). In summary, in the theta band, episodic memory 
performance is associated with an event-related increase in power, whereas the 
opposite holds true for semantic memory and the upper alpha band (see also 
Klimesch et al., 2000). 

An additional prediction of the theta hypothesis is that during the encoding of 
those words that will later be remembered (e.g., in a recognition task), a 
significantly stronger theta synchronization is expected compared to words that 
cannot be remembered later. Another prediction is that during the actual 
recognition process, correctly recognized targets will show a significantly stronger 
increase in theta power compared to not recognized targets or distractors. These 
predictions were tested in another study (Klimesch et al., 1997c) in which a set of 
96 words was used as targets, with sub-samples of 16 words each drawn from
one of 6 categories (birds, fruits, vegetables, vehicles, clothes, and weapons). The
96 distractors were selected such that for each target (e.g., robin) a semantically 
similar distractor (e.g., sparrow) was presented, so that just as for the targets, the 
96 distractors are subdivided into the same 6 semantic categories, each comprising 
16 words. This similarity between distractors and targets guarantees that subjects 
must rely on episodic information rather than, for example, semantic familiarity to 
make a correct decision in the recognition task (e.g., if a subject has to distinguish
the target word “robin” from the distractor “sparrow”, semantic information
representing the meaning of the words will not be helpful). For a correct response, 
the subject has to remember which of the two words was presented in the context 
of the study list. Figure 2 presents the major findings. The increase in theta band 
power was significantly larger during the encoding of those words that were later 
remembered as compared to those which were not remembered later. In addition, 
during recognition, the increase in theta was larger for recognized targets than 
distractors or not recognized targets. 

In all of these studies, subjects knew that their memory would be tested later.
Hence, the finding that theta synchronization is significantly stronger during the 
encoding of those words that can later be remembered could be due to specific 
encoding strategies. If this were the case, theta synchronization during encoding 
would reflect a rather unspecific factor and not the actual encoding of new 
information. One way to avoid the possible influence of encoding strategies is to 
use an incidental instead of an intentional memory paradigm. The usual procedure 
is that during the encoding phase of an incidental memory paradigm, subjects do 
not know that memory performance will be tested later, because they are 
performing some type of distractor task. Therefore, specific mnemonic techniques 
and specific attentional factors cannot play a significant role during encoding. 
Klimesch et al. (1996) used an incidental memory paradigm that consisted of two 
parts. In the first part, subjects were asked to categorize a series of words and to
respond with ‘yes’ if a word belonged to the category ‘living’ and with ‘no’ if a 
word belonged to the category ‘nonliving’. At this point of the experiment, 
subjects did not know that in the second part of the experiment they would have to 
recall the words that were presented in the judgment task. Band power values 
during the encoding stage were calculated and words that could be remembered 
later were compared with those words that could not be remembered later. The 
results again indicated that an event-related increase in theta reflects the actual 
encoding of new information, even with incidental learning of the stimulus words. 

Figure 2. Memory related increases in theta band power during encoding and recognition of words.
During encoding of those words that are later remembered in a recognition task, the increase in theta 
band power is significantly larger than for those that cannot later be remembered. The increase in 
theta during recognition is larger for words that can be remembered compared to not remembered or 
new words (after Klimesch et al., 1997c). 

The reported findings suggest that theta is a promising neural correlate of 
episodic and working memory in general (Sarnthein et al., 1998). Although the
neural generators of this EEG frequency are not yet known, these findings are in 
accord with the hypothesis that episodic memory processes are reflected by theta 
oscillations in complex hippocampo-cortical reentrant loops (Miller, 1991). This 
assertion is also consonant with converging evidence for the existence of human 
theta oscillations that comes from studies of patients using depth (Arnolds et al.,
1980) and subdural (Kahana et al., 1999) electrodes, as well as from studies with
normal subjects using high resolution EEG (Gevins et al., 1997) and MEG 
(Tesche & Karhu, 2000). 

3.2 DELTA ACTIVITY IN THE HUMAN SCALP EEG 
AND MEMORY PROCESSES 

Although it has been found repeatedly that the extent of a memory related 
increase in band power is largest for the theta band, there is evidence that the delta 
band shows similar although attenuated results. As an example, recent findings 
have demonstrated that an increase in theta band power is not restricted to verbal 
material and is particularly strong during retrieval as compared to encoding 
(Klimesch et al., 2001). The experiment consisted of two parts, an encoding and a 
retrieval session. In the first part, a set of 120 pictures was presented one at a time 
and subjects were instructed to remember them. In the second part, the ‘old’ 
pictures were presented randomly intermixed with 120 ‘new’ pictures. Subjects 
had to decide whether a particular picture was presented previously during the 
study session. Figure 3 illustrates the results: a large increase in theta band power 
(within a frequency band of about 4-6 Hz) was found during encoding and 
retrieval. The largest increase (about 150%) was found during retrieval and for 
those pictures that were correctly retrieved from memory. Most interesting, 
however, is the finding that the delta band (about 2-4 Hz) shows similar results, 
whereas the lower alpha-1 band reflects largely diminished effects with a weak
tendency of desynchronization during the late poststimulus interval. 



4. MEMORY CONSOLIDATION IN SLEEP, DELTA AND 
THETA ACTIVITY 

An enduring, related hypothesis is that sleep helps to consolidate memories 
(Jenkins & Dallenbach, 1924). However, evidence that memory consolidation during
sleep results from an interaction between the type of memory and sleep stage has only
recently been reported (for a review, see Born & Plihal, 2000). As an example,
there is increasing evidence that explicit or episodic memory is consolidated 
during slow wave sleep (SWS or stage 4 sleep) in early sleep, whereas implicit (or 
procedural) memory is consolidated during REM sleep, which dominates in late 
sleep. Most interesting, in early as well as in late sleep hippocampal activity 
appears to play a crucial role for consolidation but in different ways. Traditional 
research on memory consolidation has focused on REM sleep, with the basic idea 
that REM deprivation is the major cause for lowered memory performance. The 
procedure was to wake subjects during the experimental night when the EEG 
indicated the beginning of REM and to compare memory performance with a 
control group that experienced undisturbed sleep. Although REM deprivation may
very well have unspecific detrimental effects on memory, which most likely are
due to increased irritation, mood changes or decreased attention (cf. Van Hulzen,
1986; Van Hulzen & Coenen; Cipolli, 1995), the experimental findings with
respect to episodic memory consolidation in human subjects were rather 
inconclusive. Whereas some studies found lowered episodic memory performance 
after REM deprivation (Cartwright, 1972; Karni et al., 1994; Lewin & Glaubman,
1975; Tilley & Empson, 1978), others did not find any effects (Castaldo et al., 
1974; Ekstrand et al., 1971; Feldman & Dement, 1968; Greenberg et al., 1983; 
Muzio et al., 1972; Tilley & Empson, 1981). 



[Figure 3 panels: Percentage of increase in band power (as compared to a reference) during encoding and recognition (hits, correct rejections), shown for Delta (2-4 Hz), Theta (4-6 Hz), and Lower-1 Alpha (6-8 Hz).]

Figure 3. Memory related increases in band power during encoding and recognition of pictures. On
the average, the largest increase in band power and the largest differences between conditions
(encoding, hits and correct rejections) can be observed in the theta band. However, particularly at Pz,
the delta band also shows large effects that are similar to the theta band. This outcome is in sharp
contrast to the lower alpha-1 band that shows only a small event-related increase in band power and
small differences between conditions. Each band has a width of 2 Hz. Frequency limits represent
group averages adjusted for individual alpha frequency from each subject (after Klimesch et al.,
2001).



For these reasons, REM deprivation is not an adequate method to study 
memory consolidation during sleep, and more consistent findings have been 
obtained when the effects of early and late sleep (first versus second half of the 
experimental night) were distinguished (Ekstrand et al., 1977). This outcome is 
important because SWS dominates in early sleep (up to 40% of the first 3 or 4
hours of sleep show delta activity), whereas REM dominates during late sleep. Of
particular interest are recent studies performed by Born, Plihal, and colleagues
(Born & Fehm, 1998; Born et al., 1991, 1999; Plihal & Born, 1997; Plihal & Born,
1999a; Plihal & Born, 1999b; Plihal et al., 1999). For example, Plihal and Born
(1997) orthogonally combined two experimental conditions, early vs. late sleep 
and episodic vs. implicit memory demands (word-pair association vs. mirror 
drawing). In the ‘early’ condition, subjects had to learn a memory task before 
sleep onset and were awakened after about 3 hours for testing (retrieval). In the 
‘late’ condition subjects were awakened about 3 hours after sleep onset to learn a 
task and were awakened about 3 hours later for testing. The same procedure was 
used for control groups with the exception that subjects were not allowed to sleep 
during the time when early or late sleep occurred. A clear pattern of results was 
obtained: Increase in episodic memory performance was about twice as much in 
the early compared to the late condition and the respective control group. In 
contrast, the increase in implicit memory performance was about twice as much in 
the late compared to the early condition and the control group. These findings
have been replicated by using different (nonverbal instead of verbal) episodic
memory tasks (Plihal & Born, 1999a).

4.1 THE INVOLVEMENT OF THE HIPPOCAMPAL 
FORMATION: HIPPOCAMPAL REPLAY AND 
DELTA ACTIVITY 

Given these findings, it appears plausible to assume that memory consolidation 
during early and late sleep is associated with specific functions of the hippocampal 
formation and related brain structures. First, during early SWS cortisol secretion 
reaches its minimum during the circadian cycle. Second, the hippocampal 
formation and related regions of the temporal lobe have a comparably high density
of receptors for corticosteroids. Third, increased levels of cortisol are known to 
inhibit hippocampal functions via glucocorticoid receptors (Joels & DeKloet, 
1994), to reduce neurogenesis of granule cells (Gould et al., 1998), and to reduce 
memory performance (Pavlides et al., 1995). Moreover, Plihal and Born
(1999b) were able to demonstrate that an experimental increase of cortisol (by
infusion) during early sleep reduces episodic memory performance, whereas it has no
effect during late sleep and leaves implicit memory performance unaffected. In
addition, it was shown that these detrimental effects are mediated by 
glucocorticoid but not by mineralocorticoid receptors. 

An important question remains as to how the hippocampal formation is 
involved in memory consolidation during SWS and REM sleep. Buzsaki (1989, 
1996, 1998) has suggested that the CA3 collateral matrix, through the
establishment of highly selective excitatory network connections (e.g., by LTP or
LTD), is the area where episodic information is transiently stored (Wilson & McNaughton,
1994). In this context, it is important to note that after periods of pronounced 
theta, large irregular field potentials (or SPWs) can be observed (see Figure 1), 
which have a frequency characteristic in the delta range. It is assumed that SPWs 
are generated after theta ceases in that part of the hippocampal formation where
highly selective excitatory network connections (reflecting freshly encoded 
information) were established. It is for this reason that SPWs might be considered 
correlates of memory consolidation. 

This hypothesis is supported by several lines of evidence. For example, Wilson 
and McNaughton (1994) recorded spike trains from 50 to 100 single cells in area 
CA1 of the rat hippocampus during learning of a food reinforced spatial memory
task, as well as during SWS sleep before and after task performance. According to 
the classical work of O’Keefe and Dostrovsky (1971), the preferred location for a 
given cell to fire (‘place field’) was determined. In a next step, cells with 
overlapping place fields were selected and cross-correlations computed. The 
results show that during encoding (performance of the memory task) cells with 
overlapping fields exhibit a highly correlated and rhythmic firing pattern within 
the theta frequency range. The crucial finding, however, is that in SWS sleep after 
task performance the activity of these cells was also highly correlated, but now the 
firing pattern was no longer rhythmic but irregular with a frequency in the delta 
range. In SWS sleep before task performance, these cells did not show correlated 
activity. In contrast, cells with non-overlapping fields exhibited neither
rhythmic bursting during encoding nor correlated activity in the delta frequency
range during SWS sleep after the memory task. Thus, hippocampal delta activity
appears directly related to memory processing of new information. 



5. CONCLUDING REMARKS 

The reported findings from animal research and the human scalp EEG clearly 
support the hypothesis that theta synchronization is related to the encoding and 
retrieval of new information. Convergent evidence suggests that theta is related to 
episodic but not semantic memory. One reason for this conclusion is that only 
theta responds to episodic memory demands, whereas upper alpha responds to 
semantic memory demands. Furthermore, it is well established that the 
hippocampal formation together with a complex network involving neocortical 
and limbic areas are crucial regions for episodic memory processes (Markowitsch, 
1996). Tulving (1984) has theorized that episodic memory stores that type of 
contextual information that keeps an individual autobiographically oriented within 
space and time. Because time changes the autobiographical context permanently, 
there is a permanent and vital need to update and store episodic information. Thus, 
the formation of episodic memory traces is closely related to periods of increased 
conscious awareness and increased working memory demands. 

If we consider the model of hippocampal activity outlined in Figure 1 
(Buzsaki, 1996), two states must be distinguished. One state is defined by 
increased theta activity and is related to increases in episodic memory demands, 
which dominate in periods of increased conscious awareness (or exploratory 
behavior in animals). The other state is defined by SPWs or delta activity. It 
appears to reflect memory consolidation and is related to decreased episodic 
memory demands and decreased states of conscious awareness. Taken together it 
can be concluded that delta activity during SWS sleep, which is a state of
decreased conscious awareness, reflects consolidation of episodic memory. On
the other hand, increased theta activity during REM sleep, which is a state of
increased conscious awareness (although of a different type), provides the
necessary contextual ‘embedding’ for the encoding of implicit memory. It is
therefore likely that the lack of context during SWS is helpful for the 
consolidation of episodic memory, whereas the establishment of a (dream) context 
during REM is helpful for the consolidation of implicit memory. 

Finally, some apparently contradictory results should be discussed. Many 
studies have reported that absolute (tonic) power in the theta frequency range 
increases with age (Cristian, 1984; Niedermeyer, 1993b) and is increased in 
demented subjects (Brenner et al., 1986; Coben et al., 1985). Because memory 
performance decreases with age (and is decreased in demented subjects), these 
findings seem to contradict the hypothesis that an event-related increase in theta 
power reflects episodic memory performance. However, it was demonstrated that 
event-related theta must be distinguished from tonic power as measured during 
rest or a reference interval preceding task performance. Indeed, tonic and event- 
related (or phasic) theta power behave in different ways (Klimesch, 1999). 
Doppelmayr et al. (1998b) demonstrated that subjects with large tonic theta
power show a significantly smaller event-related increase than subjects with
smaller tonic power who show a large event-related increase. Thus, tonic and 
phasic theta power are negatively correlated. Consequently, the finding of an age 
related increase in tonic theta power is quite consistent with the reported results 
about a memory related increase in (phasic) theta power. 
Detection of change is one of the most prominent obligations of the
human brain and can occur in any sensory modality. For example, when a 
young child is shown the same red card repeatedly, the child will lose 
interest in it — i.e., the response habituates. However, as a blue card is 
presented, the new stimulus will immediately result in responses signaling 
increased attention (Squire & Kandel, 1999). This pattern of 
habituation/attention is adaptive, since the processing system should not 
react to perceptions that stay constant over time. Instead, the system should 
focus on events that change suddenly and indicate the need for a response. 
This process reflects automatic or bottom-up attention. In addition, selective 
or top-down attention can be engaged such that a voluntary focus on selected 
portions of auditory and visual input occurs. The present chapter will review 
how high-frequency electroencephalographic (EEG) activity or “gamma” is 
germane to the attentional mechanisms underlying the detection of change. 



1. OSCILLATIONS IN THE EEG 

EEG analysis is one of the main methods used to investigate the 
functional behavior of the human and animal brain. Although physicians 
focus on continuous and relatively gross EEG recordings, specific sensory 
and cognitive processes can be measured by averaging EEG responses to 
stimuli to extract event-related potentials (ERPs). Both EEG and ERP 
measures can be investigated in the frequency domain, and it has been 
convincingly demonstrated that assessing specific frequencies can often 
yield insights into the functional cognitive correlations of these signals 
(Başar et al., 1999). This result can be achieved by selectively filtering out
those parts of the signal that oscillate at a given frequency. Since, in
principle, every signal can be decomposed into sinusoidal oscillations of
different amplitudes, such decomposition is usually computed using the
Fourier transform to quantify the oscillations that constitute the signal
(Dumermuth, 1977).

1.1 ALPHA RHYTHM BEGINNINGS 

Oscillations marked the very beginning of EEG research, when the German
neurophysiologist Berger (1929) first observed the dominant oscillations of 
approximately 10 Hz recorded from the human scalp. Berger coined the term 
alpha frequency for activity in this frequency range by using the first letter 
of the Greek alphabet. 






Figure 1. Ten seconds of continuous EEG recorded with eyes closed showing slow alpha
activity (7-8 Hz) in seconds 24 and 25. Eye-opening in seconds 26 to 31 results in suppression
of alpha activity and subsequently leads to speeded alpha of 10-11 Hz in seconds 32 and 33.

Berger dubbed the second type of rhythmic activity that he found in the 
human EEG as beta, which is now considered to be the frequency range of 
approximately 12-30 Hz. Following this consecutive ordering, Adrian (1942) 
referred to the oscillations around 40 Hz observed after odor stimulation in 
the hedgehog as gamma waves. Başar-Eroglu et al. (1996b) describe this
report as the first stage of gamma research. In this taxonomy, the second 
stage was initiated by Freeman (1975), who found 40 Hz was strongly 
associated with perceptual models of the rabbit's olfactory bulb. The third 
phase started with the work of Galambos et al. (1981), which made gamma 
oscillations generally accepted in studies of human perception. The fourth 
phase and a major influence on gamma activity research stemmed from Gray
et al. (1989), who showed that synchronous firing of single neurons in the 40
Hz range could help account for the 'perceptual binding' that produces a
unitary conscious experience. Karakaş and Başar (1998) have helped to
define the fifth phase, which is marked by an enormous number of different 
paradigms and methods applied to solve the 'gamma puzzle'. 
1.2 GAMMA ACTIVITY AND ITS FUNCTIONAL
ROLES 



Neurons in primary sensory cortex typically code simple features of 
perceived stimuli, such that perceptual objects are composed of various 
features, which are represented by different neurons in the brain. The neural 
activity that codes object features is somehow bound together to produce
the perception of a coherent object. The so-called binding problem arises
when multiple objects are perceived at one time, and their single features 
could potentially be bound incorrectly to produce illusory conjunctions. 
Thus, understanding the neurocognitive mechanisms of binding and 
attention is a fundamental and important cognitive problem. 

Oscillatory activity in the gamma frequency range (30-80 Hz) has been 
found to reveal correlates of processes that are associated with binding 
phenomena. In particular, neurons in the animal brain that oscillate at about 
40 Hz are believed to represent the binding of different features of one 
object to form a single coherent percept (Eckhorn et al., 1988; Engel et al.,
1992; Gray et al., 1989). Figure 2 schematically illustrates this process. 
When one bar (object) is moved across the receptive fields of two neurons in 
cat visual cortex, the responses of these two neurons are synchronous (i.e., 
they spike at the same time) and their frequency occurs in the gamma range. 
When two bars move in the same direction (and are usually perceived as one 
interrupted object) the neurons still fire with some degree of synchrony. 
However, if two bars move in opposite directions, which will be perceived 
as two individual objects, the neural discharges are no longer synchronous. 




[Figure 2 panels: one moving bar (high synchrony); coherently moving bars (intermediate synchrony); incoherently moving bars (low synchrony).]



Figure 2. Black bars moving across the receptive fields (gray) of neurons in cat visual cortex 
and the neuronal response. Vertical lines indicate single-unit spikes in response to stimuli. 



Similar findings have been reported from the human EEG, which 
demonstrate higher induced gamma activity for one bar than for two 
incoherently moving bars (Müller et al., 1997). Such findings have been
interpreted as reflecting gamma activity in the human EEG that is associated
with visual binding. It has been found for illusory contours (Kanizsa figures,
Section 4) where the inducer disks are bound together for the perception of
the figure (Hermann et al., 1999; Tallon et al., 1995; Tallon-Baudry et al.,
1996). Induced gamma activity also has been reported when subjects
suddenly see a meaningful object in a formerly non-meaningful stimulus (Keil
et al., 1999; Tallon-Baudry et al., 1997). 

Another function reflected by gamma activity is attention. Tiitinen et al. 
(1993) demonstrated that the gamma response around 50 milliseconds after 
auditory tone pips was larger when subjects were instructed to attend to the 
ear where the stimulus occurred (attended condition) compared to when they 
were instructed to attend to the other ear (unattended condition). Similar 
findings in the visual modality revealed increased gamma activity over the 
occipital cortex for attended versus unattended flickering lights (Müller et
al., 1998). Data from experiments with different Kanizsa figures designed to 
differentiate binding and attention processes also support the notion that 
attention is a main source for gamma activity (Hermann & Mecklinger, 
2000a; Hermann & Mecklinger, 2000b; Hermann et al., 1999). 

1.3 FURTHER ROLES OF GAMMA ACTIVITY

In addition to binding and attention as functional correlates of gamma, 
other processes also have been associated with gamma activity. Jokeit and 
Makeig (1994) and Müller et al. (1998) have shown how gamma activity
correlates with reaction times of fast and slow responders. Başar-Eroglu et
al. (1996a) related gamma activity with the formation of a stable percept in a 
multi-stable pattern (a Necker cube) by showing that before a pattern 
reversal, there is more gamma activity than during the stable perception of 
one of the two percepts. Miltner et al. (1999) have demonstrated how 
associative learning produces coherence of gamma activity in visual and 
motor areas when motor responses to light stimuli are learned. Reviews 
related to the functional relevance of gamma oscillations in humans and 
animals can be found in Başar-Eroglu et al. (1996b). Additional reviews
concerning the relation of gamma activity to human visual perception
(Tallon-Baudry & Bertrand, 1999) and attentional mechanisms (Müller et
al., 2000) are also available. 



2. TYPES OF GAMMA ACTIVITY 

According to a classification of different types of gamma activity by 
Galambos (1992), there are spontaneous, induced, and evoked gamma 
rhythms, all of which are differentiated by their degree of phase-locking to 
the stimulus (emitted gamma rhythms in response to omitted stimuli also 
have been observed, but these will not be considered here). In this 
framework, spontaneous activity is completely uncorrelated with the 
occurrence of an experimental condition. Induced activity is correlated with 
experimental conditions but is not strictly phase-locked to its onset. Evoked
activity is strictly phase-locked to the onset of an experimental condition. 

Figure 3 illustrates phase-locking. It is important to note that phase- 
locking, rather than time-locking, is the crucial parameter that determines 
whether or not activity is cancelled out or summed when multiple signals are 
analyzed. When signals with temporal relations as illustrated in Figure
3a and b or c and d are averaged, they will sum since they are effectively
phase-locked to the virtual stimulus at time point 125. Alternatively, if signals
are time-locked but not phase-locked, as in Figure 3a and c, or neither time- nor
phase-locked, as in Figure 3b and d, single trials will cancel out in the average.





Signal:    Relation:                           Average:

a + b      time-locked and phase-locked        add up
a + c      time-locked but not phase-locked    cancel out
c + d      not time-locked but phase-locked    add up
b + d      neither time- nor phase-locked      cancel out



Figure 3. Signals must be phase-locked, not time-locked to sum across multiple epochs. 

Some spurious oscillations in the gamma frequency range are present in 
the human EEG without correlation to experimental conditions during and 
between stimulation periods. This activity is considered to be spontaneous 
and usually cancels out completely if an averaged ERP is computed across 
enough stimulus repetitions. Thus, these oscillations likely originate from
cognitive processes unrelated to the specific mental task being performed.
Figure 4. If oscillations occur at the same latency after stimulus onset with the same phase
relative to stimulus onset in multiple trials (rows 1-4), they are considered evoked by the 
stimulus (left). If latency or phase jitter relative to stimulus onset, the oscillations are 
considered to be induced by the stimulus (right). Evoked activity sums up in the average 
(bottom row), while induced activity is nearly cancelled out. 



If oscillations occur after each stimulation but with varying onset times 
and/or phase jitter, they are considered as being induced by the stimulus 
rather than evoked and are not visible in the averaged ERP. Figure 4 (right)
illustrates this outcome. Special methods have to be applied to record this 
type of activity (see Section 3). This type of gamma activity is assumed to 
reflect cognitive processes of binding and figure representation. 

Oscillatory activity in EEG can be phase-locked to the onset of an 
experimental stimulus, as it starts at approximately the same latency after 
stimulus onset for every repetition of the stimulus. Figure 4 (left) illustrates
this outcome. In this case, the activity is called evoked; it sums across trials and is visible
in the averaged ERP. This type of activity is usually evoked by any kind of 
sensory stimulation, like auditory, visual or somatosensory stimulation. 
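
The distinction between evoked and induced activity can be made concrete with a small simulation. The following is only a minimal sketch under stated assumptions: the trials are idealized, noise-free 40 Hz sinusoids, and the function names, the 500 Hz sampling rate, and the number of trials are hypothetical choices made for illustration, not values taken from the studies reviewed here.

```python
import math
import random


def make_trial(freq_hz, phase_rad, fs_hz=500.0, n_samples=250):
    """One simulated trial: a unit-amplitude sinusoid with the given phase."""
    return [math.sin(2.0 * math.pi * freq_hz * k / fs_hz + phase_rad)
            for k in range(n_samples)]


def average_trials(trials):
    """Point-by-point average across trials, as done when computing an ERP."""
    n_trials = len(trials)
    return [sum(samples) / n_trials for samples in zip(*trials)]


def peak_amplitude(signal):
    """Largest absolute value in the signal."""
    return max(abs(v) for v in signal)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_trials = 200

# Evoked: identical phase relative to stimulus onset on every trial.
evoked_trials = [make_trial(40.0, 0.0) for _ in range(n_trials)]
# Induced: same frequency, but the phase jitters randomly from trial to trial.
induced_trials = [make_trial(40.0, rng.uniform(0.0, 2.0 * math.pi))
                  for _ in range(n_trials)]

evoked_peak = peak_amplitude(average_trials(evoked_trials))
induced_peak = peak_amplitude(average_trials(induced_trials))
print(f"evoked average peak:  {evoked_peak:.3f}")
print(f"induced average peak: {induced_peak:.3f}")
```

In this sketch the phase-locked trials survive averaging at close to their single-trial amplitude, whereas the trials with random phase largely cancel, even though every single trial contains a full-amplitude 40 Hz oscillation.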

2.1 LATENCY 

Differentiation of gamma activity also can be made according to the 
latency at which it occurs after stimulus onset. Early gamma activity usually 
peaks around 100 milliseconds after stimulus onset and is evoked by the 
stimulus. It reflects early processes of stimulus encoding and attention. 
Tallon-Baudry et al. (1996, 1997) have shown that gamma activity can peak 
at a later time. This late gamma activity is usually induced by the stimulus
and reflects perceptual and cognitive processes. 



3. METHODS FOR GAMMA ANALYSIS 

The analysis of gamma and other EEG frequencies requires some 
precautions when data are recorded as well as specific frequency analysis 
tools. These are discussed next. 

Two important parameters for the recording equipment are critical to 
properly record gamma activity: (1) The sampling rate has to be set to a 
value that is at least twice the frequency that should be analyzed (four times 
is better and is required by some software). For example, if gamma activity 
up to 80 Hz will be analyzed, a minimum sampling rate of 160 Hz is needed
and 320 Hz is recommended. (2) The low pass filter needs to be set to a value
higher than the highest frequency that should be analyzed. The low pass 
filter is usually integrated in the analog amplifier to prevent aliasing errors 
when digitizing analog data. This step is sometimes overlooked when trying 
to record gamma activity for the first time. It is also worth noting that the 
lower 3 dB edge frequency is the critical value of a low-pass filter and not its 
middle frequency. 
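
The sampling rule can be written down in a few lines. This is a hedged sketch only: the function names are hypothetical, and the folding formula assumes an ideal sinusoid sampled without any analog low pass filter, which is exactly the situation the anti-aliasing filter is meant to prevent.

```python
def min_sampling_rate(highest_freq_hz, factor=2.0):
    """Sampling rate for a given highest frequency of interest.

    factor=2.0 is the minimum named in the text; factor=4.0 is the
    more conservative recommendation.
    """
    return factor * highest_freq_hz


def apparent_frequency(signal_freq_hz, fs_hz):
    """Frequency at which an unfiltered sinusoid appears after sampling.

    Frequencies above fs_hz / 2 fold back (alias) into the range
    from 0 to fs_hz / 2.
    """
    nearest_multiple = round(signal_freq_hz / fs_hz) * fs_hz
    return abs(signal_freq_hz - nearest_multiple)


print(min_sampling_rate(80.0))              # minimum rate for 80 Hz gamma
print(min_sampling_rate(80.0, factor=4.0))  # recommended rate
print(apparent_frequency(80.0, 100.0))      # 80 Hz sampled too slowly
print(apparent_frequency(80.0, 320.0))      # 80 Hz sampled adequately
```

With a sampling rate of only 100 Hz, an 80 Hz oscillation would be indistinguishable from a 20 Hz oscillation in the digitized data, which illustrates why both the sampling rate and the analog filter setting need to be checked before recording.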

3.1 ARTIFACTS 

All artifacts that contaminate traditional ERP averages should be 
excluded from gamma analysis as well. In addition, there are several specific 
artifact conditions that are especially crucial when gamma activity is 
analyzed. 

A potential problem when recording gamma is the influence of alpha 
frequency harmonics. Whenever an oscillation is not purely sinusoidal, it 
leads to so-called harmonic frequencies at integer multiples of that 
frequency. For non-sinusoidal alpha activity around 10 Hz, one such
harmonic can be in the gamma range (Jürgens et al., 1995). Figure 5 (left)
illustrates how a pure sine wave of 10 Hz leads to exactly one spectral peak
at 10 Hz, but even a slight change of its shape (right) can lead to harmonic
peaks at 20 and 40 Hz. Hence, studies should ensure that the gamma activity
behaves independently of the alpha activity, and that the alpha activity
does not simply show the identical effects. Different latencies and different
topographical distributions can serve to discriminate the two outcomes. 
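
The harmonic problem can be reproduced numerically. The sketch below is an illustration under assumptions: the distortion (clipping the positive peaks of a 10 Hz sine at 0.8) is an arbitrary, hypothetical choice, and the amplitude estimate is a plain discrete Fourier sum evaluated at single frequencies rather than any particular analysis package.

```python
import math


def amplitude_at(signal, freq_hz, fs_hz):
    """Amplitude of the sinusoidal component at freq_hz (discrete Fourier sum).

    Exact for components that complete a whole number of cycles in the signal.
    """
    n = len(signal)
    re = sum(v * math.cos(2.0 * math.pi * freq_hz * k / fs_hz)
             for k, v in enumerate(signal))
    im = sum(v * math.sin(2.0 * math.pi * freq_hz * k / fs_hz)
             for k, v in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


fs_hz = 1000.0
n_samples = 1000  # one second of data, i.e., ten full cycles of 10 Hz

# A pure 10 Hz sine wave ("alpha").
pure = [math.sin(2.0 * math.pi * 10.0 * k / fs_hz) for k in range(n_samples)]
# The same wave with its positive peaks clipped: no longer purely sinusoidal.
distorted = [min(v, 0.8) for v in pure]

for freq_hz in (10.0, 20.0, 40.0):
    print(freq_hz,
          round(amplitude_at(pure, freq_hz, fs_hz), 4),
          round(amplitude_at(distorted, freq_hz, fs_hz), 4))
```

The pure sine has energy only at 10 Hz, while the clipped version also shows small components at 20 and 40 Hz. No 40 Hz generator exists in this simulation, so the 40 Hz component is entirely a by-product of the shape of the 10 Hz wave.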

In addition, it is important to emphasize that when differences in EEG 
data are absent between two experimental conditions, this result does not 
necessarily imply an absence of differences in the underlying brain 
processes. It may be that the electric responses of the brain processes are
different but they do not propagate all the way through brain and skull to 
affect EEG recordings. 





Figure 5. A sinusoidal 10 Hz wave leads to exactly one spectral frequency response at 10 Hz 
(left). A distorted sine wave can lead to additional harmonic frequencies in the spectrum (right).

Another potential confound of human gamma activity is 
electromyography (EMG). If subjects sit uncomfortably or chew during an 
EEG session and innervate their muscles, the EEG electrodes will record 
EMG activity. This high frequency muscle-related activity (30-80 Hz) can 
be mistaken for gamma EEG activity. Therefore, all epochs that are 
subsequently averaged should be visually evaluated for the occurrence of 
such EMG artifacts, which should then be excluded from further analysis. 
Figure 6. Clean EEG data and its frequency spectrum (left) and an epoch with EMG
contamination leading to frequency peaks around 40 Hz.

Figure 6 shows ten seconds of clean EEG and the corresponding
frequency spectrum with a 0 Hz and a 12 Hz peak (left). EMG activity can 
easily be detected in the time domain (right) but may be mistaken for gamma 
activity in the spectrum. 

3.2 FREQUENCY ANALYSIS METHODS 

Several methods exist to exclusively extract oscillations of a specific 
frequency from ERP data. Among the most popular are filtering, Fourier 
transformation, and wavelet analysis. 

Figure 7 shows the results of the three methods to extract frequency 
information. Left panel: filtering an ERP with a band pass filter (35-45 Hz) 
shows a clear burst of 40 Hz activity around 100 milliseconds. This 
oscillatory activity is enhanced for the dotted as compared to the solid 
condition. Middle panel: Fourier spectrum analyses of the time interval from 
50-150 milliseconds. An increase of activity for the dotted condition can be 
noticed around 40 Hz. Right panel: the absolute values of the wavelet 
transform of the ERP are shown for a 40 Hz wavelet. The difference 
between conditions is very prominent and can be observed at every point in
time because the transformed signal, unlike the filtered ERP, does not itself oscillate. The wavelet transform can
be thought of as the envelope of the filtered ERP. The wavelet transform is 
ideally suited for ERP frequency analysis and will be discussed below. 





[Figure 7 panel titles: FFT; Wavelet transform]



Figure 7. Three possibilities to extract frequency information from ERP data: a 35-45 Hz 
filtered ERP (left), the FFT spectrum of the epoch 50-150 milliseconds (middle) and the 
wavelet transform of the ERP (right). 



3.3 THE WAVELET TRANSFORM 



To compute a wavelet transform, the original signal is convolved with a
wavelet function. In the case of the Morlet wavelet used here, it is calculated
according to the formula

$$\Psi(t) = e^{i\omega_0 t} \cdot e^{-t^2/2}$$

where $\omega_0$ is $2\pi$ times the frequency of the unshifted and uncompressed
mother wavelet. Figure 8 schematically illustrates how these mathematical
terms construct a wavelet.



Figure 8. Multiplying a sinusoidal function (a) and an envelope function (b) results in a 
wavelet (c). 

Mathematically convolving wavelets with signals produces a new signal 
(the convolution) that can be interpreted as the similarity of the wavelet to 
the signal. Wavelets can be compressed to obtain wavelets of different 
frequencies (substitute t by t/a, where a = compression factor). The mother 
wavelet (a = 1) has the same frequency as the sampling frequency (fs) of the 
signal. Wavelets of lower frequencies are computed by increasing a (e.g., if 
a = fs the wavelet has a frequency of 1 Hz). 

Convolving the signal and the shifted and compressed wavelet leads to a 
new signal 

$$s_a(b) = A \int \bar{\Psi}\left(\frac{t-b}{a}\right) \cdot x(t)\, dt$$

where $\bar{\Psi}$ is the conjugate of the complex wavelet and x(t) is the original 
signal. These new signals $s_a(b)$ are computed for different scaling factors a. 
For the experiments in Section 4, we calculated the gamma activity by using 
a wavelet that was compressed to 40 Hz. The scaling factor $A = 1/\sqrt{a}$ is 
used to scale the wavelet prior to convolution. 
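
A discrete version of this convolution might look as follows. The helper names `morlet` and `wavelet_convolve`, the sampling rate, and the test signals are hypothetical; the sketch only illustrates that a wavelet compressed to 40 Hz responds strongly to a 40 Hz signal and hardly at all to a 10 Hz signal.

```python
import numpy as np

def morlet(freq_hz, fs_hz, n_cycles=6.0):
    # Assumed parametrisation: time in samples, w0 = 2*pi, six cycles in total.
    a = fs_hz / freq_hz
    half = int(round(n_cycles / 2.0 * a))
    u = np.arange(-half, half + 1) / a
    return np.exp(2j * np.pi * u) * np.exp(-u**2 / 2.0)

def wavelet_convolve(x, freq_hz, fs_hz):
    """Discrete s_a(b) = A * sum_t conj(psi((t - b)/a)) * x(t), A = 1/sqrt(a)."""
    a = fs_hz / freq_hz
    psi = morlet(freq_hz, fs_hz)
    # Convolving with the time-reversed conjugate evaluates the sum above for
    # every shift b; mode="same" keeps the output aligned with x.
    return np.convolve(x, np.conj(psi)[::-1], mode="same") / np.sqrt(a)

fs = 1000.0                                   # assumed sampling rate in Hz
t = np.arange(1000) / fs                      # one second of data
s40 = wavelet_convolve(np.cos(2 * np.pi * 40.0 * t), 40.0, fs)
s10 = wavelet_convolve(np.cos(2 * np.pi * 10.0 * t), 40.0, fs)
peak40 = np.abs(s40[500])                     # response to the matching frequency
peak10 = np.abs(s10[500])                     # response to a distant frequency
```

The result is complex; its absolute value gives the amplitude envelope at the wavelet frequency.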

To represent phase-locked (evoked) activity, the wavelet transform is 
computed on the average over the single trials (the ERP). This is denoted by 
the formula WTAvg (Wavelet Transform of Average). Since the wavelet 
transform returns complex numbers, the absolute values are calculated. 

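$$\mathrm{WTAvg} = \left| A \int \bar{\Psi}\left(\frac{t-b}{a}\right) \cdot \frac{1}{n} \sum_{i=1}^{n} eeg_i(t)\, dt \right|$$

where $eeg_i(t)$ is the i-th of the n single-trial EEG epochs. 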
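
The order of operations for the evoked measure (average first, transform second, then take absolute values) can be sketched with simulated data. Everything about the data here is an assumption made for illustration: 50 trials, a phase-locked 40 Hz burst centred at 100 milliseconds, unit-variance noise, and a 1000 Hz sampling rate.

```python
import numpy as np

def morlet(freq_hz, fs_hz, n_cycles=6.0):
    # Assumed parametrisation: time in samples, w0 = 2*pi, six cycles in total.
    a = fs_hz / freq_hz
    half = int(round(n_cycles / 2.0 * a))
    u = np.arange(-half, half + 1) / a
    return np.exp(2j * np.pi * u) * np.exp(-u**2 / 2.0)

def wavelet_convolve(x, freq_hz, fs_hz):
    a = fs_hz / freq_hz
    psi = morlet(freq_hz, fs_hz)
    return np.convolve(x, np.conj(psi)[::-1], mode="same") / np.sqrt(a)

rng = np.random.default_rng(0)
fs = 1000.0
times = np.arange(-400, 600) / fs              # epoch from -400 to 599 ms
n_trials = 50
# Hypothetical data: a 40 Hz burst with identical phase in every trial.
envelope = np.exp(-((times - 0.1) / 0.03) ** 2 / 2.0)
burst = envelope * np.cos(2 * np.pi * 40.0 * (times - 0.1))
trials = burst + rng.normal(0.0, 1.0, size=(n_trials, times.size))

erp = trials.mean(axis=0)                      # average over single trials
wt_avg = np.abs(wavelet_convolve(erp, 40.0, fs))  # |wavelet transform of the average|
peak_latency = times[np.argmax(wt_avg)]        # latency of the evoked gamma peak
```

Because the burst has the same phase in every trial, it survives averaging and dominates `wt_avg`, whereas the noise is attenuated by the average.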

The baseline of the raw data in a time interval prior to stimulation needs 
to be subtracted from each EEG epoch prior to averaging. After calculating 
the gamma activity, the frequency-specific baseline activity at 40 Hz can be 
subtracted to yield values that indicate gamma amplitude relative to baseline. 
When wavelet convolutions are computed, the convolution peaks at the same 
latency as the respective frequency component in the raw data, although the 
peak width will be smeared. Therefore, the baseline should be chosen to 
precede the stimulation by half the width of the wavelet (i.e., 150 
milliseconds for six 25 millisecond cycles of a 40 Hz wavelet) to avoid the 
temporal smearing of post-stimulus activity into the interval directly 
preceding the stimulus. To avoid distortions by the rectangular window 
function that can result from 'cutting out' a single epoch from continuous raw 
data, the convolution should start and end one wavelet length before the 
baseline and after the end of the assessed time interval. 
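
The two baseline steps can be sketched as follows. The epoch limits, the raw baseline window, the frequency-specific baseline window (here ending 150 milliseconds before stimulus onset), and the simulated epochs are all illustrative assumptions. The epoch is padded so that the convolution starts more than one wavelet length before the baseline.

```python
import numpy as np

def morlet(freq_hz, fs_hz, n_cycles=6.0):
    # Assumed parametrisation: time in samples, w0 = 2*pi, six cycles in total.
    a = fs_hz / freq_hz
    half = int(round(n_cycles / 2.0 * a))
    u = np.arange(-half, half + 1) / a
    return np.exp(2j * np.pi * u) * np.exp(-u**2 / 2.0)

def wavelet_convolve(x, freq_hz, fs_hz):
    a = fs_hz / freq_hz
    psi = morlet(freq_hz, fs_hz)
    return np.convolve(x, np.conj(psi)[::-1], mode="same") / np.sqrt(a)

rng = np.random.default_rng(1)
fs = 1000.0
times = np.arange(-600, 700) / fs              # padded epoch, in seconds
# Hypothetical epochs: noise around a DC offset of 5 (arbitrary units).
trials = rng.normal(5.0, 1.0, size=(40, times.size))

# Step 1: subtract the raw pre-stimulus mean from each epoch before averaging.
raw_bl = (times >= -0.4) & (times < 0.0)       # assumed raw baseline window
trials = trials - trials[:, raw_bl].mean(axis=1, keepdims=True)
erp = trials.mean(axis=0)

# Step 2: compute the 40 Hz activity, then subtract its mean over a baseline
# window that ends well before stimulus onset (assumed -400 to -150 ms).
gamma = np.abs(wavelet_convolve(erp, 40.0, fs))
freq_bl = (times >= -0.4) & (times <= -0.15)
gamma_rel = gamma - gamma[freq_bl].mean()      # gamma amplitude relative to baseline
```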

Figure 9 (left) depicts the convolution of an EEG with a wavelet that 
results in a new signal. These wavelet convolutions can be computed for 
multiple frequencies and the amplitudes of the convolutions can then be 
color- or gray-scale-coded in one single diagram. Figure 9 (right) illustrates 
this method, which is called a time-frequency representation. 
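
A time-frequency representation of this kind is simply a two-dimensional array with one row per wavelet frequency. The sketch below assumes a hypothetical 40 Hz test signal, a 1000 Hz sampling rate, and a frequency grid from 20 to 60 Hz; none of these values come from the text.

```python
import numpy as np

def morlet(freq_hz, fs_hz, n_cycles=6.0):
    # Assumed parametrisation: time in samples, w0 = 2*pi, six cycles in total.
    a = fs_hz / freq_hz
    half = int(round(n_cycles / 2.0 * a))
    u = np.arange(-half, half + 1) / a
    return np.exp(2j * np.pi * u) * np.exp(-u**2 / 2.0)

def wavelet_convolve(x, freq_hz, fs_hz):
    a = fs_hz / freq_hz
    psi = morlet(freq_hz, fs_hz)
    return np.convolve(x, np.conj(psi)[::-1], mode="same") / np.sqrt(a)

fs = 1000.0
t = np.arange(1000) / fs
x = np.cos(2 * np.pi * 40.0 * t)               # hypothetical 40 Hz test signal
freqs = np.arange(20.0, 65.0, 5.0)             # 20, 25, ..., 60 Hz
# One row per frequency, one column per time point.
tfr = np.array([np.abs(wavelet_convolve(x, f, fs)) for f in freqs])
best = freqs[np.argmax(tfr[:, 500])]           # frequency with the largest amplitude mid-signal
```

The array `tfr` can then be displayed as a color- or gray-scale-coded image with time on the horizontal axis and frequency on the vertical axis.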

This time-frequency representation (WTAvg) contains only that part of 
the activity that is phase-locked to the stimulus onset. Activity that is 
not phase-locked to stimulus onset is cancelled out in the average. To 
assess it, the sum of evoked and induced activity is computed first. To 
calculate this sum of all activity at one frequency, the absolute 
values of the wavelet transforms of the single trials are averaged (AvgWT), 
which means that each single trial is transformed first and the absolute 
values are averaged subsequently. 

Figure 9. Left: Convolving an EEG (top) with a moving wavelet (middle) results in a 
convolution (bottom). Right: Multiple such convolutions can be mapped in a time-frequency 
representation. This is shown for the evoked gamma activity (top) of the example in Figure 4, 
the sum of evoked and induced gamma activity (middle) and isolated induced gamma activity 
(bottom). 



This new time-frequency representation contains all activity of one 
frequency that occurred after stimulus onset, no matter whether it was phase- 
locked to the stimulus or not. As above, the 40 Hz activity in a pre-stimulus 
interval (e.g., -400 to -150 milliseconds) can be subtracted to obtain a 
relative measure. This sum of evoked and induced activity is also known 
simply as induced activity (Tallon-Baudry & Bertrand, 1999), since the 
absolute amount of evoked activity is small compared to the much higher 
absolute values of the summed activity. To obtain only activity not phase- 
locked to stimulus onset, the evoked activity (WTAvg) needs to be 
subtracted from the sum of evoked and induced activity (AvgWT), such that 
absolute measures are subtracted to obtain AvgWT-WTAvg (i.e., no baseline 
correction in the frequency domain). 



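$$\mathrm{AvgWT} = \frac{1}{n} \sum_{i=1}^{n} \left| A \int \bar{\Psi}\left(\frac{t-b}{a}\right) \cdot x_i(t)\, dt \right|$$

where $x_i(t)$ is the i-th of the n single-trial EEG epochs. 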
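
The difference between the two measures can be illustrated with simulated trials. The sketch assumes a phase-locked 40 Hz burst at 100 milliseconds and a burst with a random phase per trial at 300 milliseconds; the trial count, noise level, and sampling rate are likewise assumptions.

```python
import numpy as np

def morlet(freq_hz, fs_hz, n_cycles=6.0):
    # Assumed parametrisation: time in samples, w0 = 2*pi, six cycles in total.
    a = fs_hz / freq_hz
    half = int(round(n_cycles / 2.0 * a))
    u = np.arange(-half, half + 1) / a
    return np.exp(2j * np.pi * u) * np.exp(-u**2 / 2.0)

def wavelet_convolve(x, freq_hz, fs_hz):
    a = fs_hz / freq_hz
    psi = morlet(freq_hz, fs_hz)
    return np.convolve(x, np.conj(psi)[::-1], mode="same") / np.sqrt(a)

rng = np.random.default_rng(2)
fs = 1000.0
times = np.arange(-400, 600) / fs
n_trials = 100

def burst(center_s, phase):
    """Gaussian-windowed 40 Hz burst (hypothetical test signal)."""
    env = np.exp(-((times - center_s) / 0.03) ** 2 / 2.0)
    return env * np.cos(2 * np.pi * 40.0 * (times - center_s) + phase)

# Evoked burst at 100 ms (fixed phase), induced burst at 300 ms (random phase).
phases = rng.uniform(0.0, 2 * np.pi, n_trials)
trials = np.array([burst(0.1, 0.0) + burst(0.3, p) for p in phases])
trials = trials + rng.normal(0.0, 0.5, size=trials.shape)

# WTAvg: transform of the average (evoked activity only).
wt_avg = np.abs(wavelet_convolve(trials.mean(axis=0), 40.0, fs))
# AvgWT: average of the single-trial absolute transforms (evoked + induced).
avg_wt = np.mean([np.abs(wavelet_convolve(tr, 40.0, fs)) for tr in trials], axis=0)
# AvgWT - WTAvg: activity that is not phase-locked to stimulus onset.
induced = avg_wt - wt_avg
i100 = np.argmin(np.abs(times - 0.1))
i300 = np.argmin(np.abs(times - 0.3))
```

In this simulation `wt_avg` peaks only at the phase-locked burst, `avg_wt` shows both bursts, and `induced` retains mainly the burst at 300 milliseconds.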



4. GAMMA AND ATTENTION: AN ILLUSTRATIVE 
STUDY 

To assess whether gamma activity is related to top-down processes of 
attention, two experiments were conducted using the same stimuli but 
different tasks. It is assumed that electrophysiological responses that are 
identical across both experiments reflect bottom-up processes and are not 
affected by top-down task requirements. If electrophysiological responses 
change between the experiments, these changes must reflect a top-down process. 


Figure 10. The four stimulus types used in the experiment: a) Kanizsa square, b) Kanizsa 
triangle, c) non-Kanizsa triangle, and d) non-Kanizsa square. 

Figure 10 shows the four stimuli that were used for the two experimental 
conditions. Two of the stimuli represent Kanizsa figures (a and b), while the 
others (c and d) have similar physical properties but do not constitute 
illusory figures. Note that in the latter figures the pac-men are rotated in such 
a way that none of the separate stimuli can be bound together into shapes by 
collinear line segments. Therefore, the two experimental conditions are 
designed to differentiate between binding and attention, since one stimulus is 
defined as target (to test attention) but two of them require binding for the 
perception of an illusory figure. In Experiment 1, the Kanizsa square was 
defined as the target and had to be counted by the subjects (Herrmann et al., 
1999). In Experiment 2, the non-Kanizsa square (c) was defined as the target 
and was counted instead (Herrmann & Mecklinger, 2001). 

Figure 11 shows the ERPs to the four stimuli. As expected, the Kanizsa 
square elicited the largest P3 in Experiment 1, since it was counted. The 
non-Kanizsa square elicited the largest P3 in Experiment 2. The P3 is the 
only ERP component that represents a top-down process in these 
experiments. In Experiment 2, its amplitude was reduced and its latency 
prolonged, which indicates that the task was harder than in Experiment 1. 




Figure 11. ERPs from electrode Oz for Experiments 1 and 2. P1 and N1 components are 
independent of task requirements. The P3 component is affected by the task change between 
experiments and is delayed in latency and reduced in amplitude. Kanizsa square (solid), 
Kanizsa triangle (dashed), non-Kanizsa square (dotted) and non-Kanizsa triangle 
(intermittently dotted). 

The P1 and N1 components were constant across the two experiments. 
P1 and N1 reflect the bottom-up processes of sensory input coding. Hence, 
P1 was mainly affected by the number of pac-men in a figure. Triangles 
evoked larger P1 amplitudes than squares, which may be due to less 
extinction of the asymmetrical shapes in the two hemispheres. Note that the 
Kanizsa figures elicited larger N1 amplitudes than the non-Kanizsa figures. 
This result suggests that the illusory figures are clearly processed by the 
subjects. 

Figure 12 presents the topographic distribution of the early evoked 
gamma activity (50 to 150 milliseconds) for the four different stimuli in 
Experiment 1. The Kanizsa square, which was defined as the target stimulus 
and mentally counted, clearly demonstrated stronger activation than the 
other three stimuli. This target effect could possibly mean that the early 
evoked gamma activity reflects a top-down mechanism of attention. Even 
though the four stimuli varied on two dimensions (figureness and number of 
pac-men) to differentiate between binding and attention, the use of a Kanizsa 
figure as a target may have produced a confound, as this stimulus is readily 
perceived as a “square.” However, Experiment 2 employed the non-Kanizsa 
square as the target, and it was expected that this new target would evoke the 
early gamma activity as did the Kanizsa square in Experiment 1. 


Figure 12. Topographic amplitude maps of the early gamma activity for the four stimuli of 
Experiment 1. The target stimulus elicited the highest gamma response. 



Figure 13 presents the topographic amplitude distribution of the early 
evoked gamma activity for the stimuli of Experiment 2. The pattern is 
clearly different: the non-Kanizsa square now elicits the largest gamma 
response. This result is consistent with the hypothesis. The Kanizsa 
square and the non-Kanizsa triangle also elicit some gamma activity. These 
two stimuli each share one feature with the target stimulus: the Kanizsa 
square is also a square, and the non-Kanizsa triangle is also a non-Kanizsa 
figure. 




Figure 13. Topographic amplitude maps of the early gamma activity for the four stimuli of 
Experiment 2. Even though the four stimuli are identical to those of Experiment 1, the change 
of task affects the gamma response. The target elicits the largest gamma response. 

The results of the early evoked gamma activity demonstrate that just like 
the P3 ERP, the early evoked gamma activity reflects top-down processes of 
attention. The P3 peaks around 400 milliseconds, but the early evoked 
gamma activity peaks much earlier, around 100 milliseconds. It is 
noteworthy that the order of amplitude of the early evoked gamma activity 
already resembles the pattern of reaction times about 500 milliseconds 
before the actual button press (Herrmann & Mecklinger, 2001). 



5. CONCLUSION 

Gamma activity in the human EEG and MEG is related to at least two 
cognitive functions: (1) Numerous studies have shown that binding induces 
40 Hz oscillations in humans. (2) The present experiments demonstrate that 
attention is even more important for gamma activity than binding together 
the pac-men of a Kanizsa figure. This leads to the question of how the two 
processes of binding and attention interact with each other. Detecting 
changes in the environment nicely illustrates this interaction. 

Figure 14. When pac-men stimuli are bound together, a pop-out of the resulting Kanizsa 
square is perceived. The meaningful Gestalt among randomly arranged pac-men is 
automatically attended to by our visual system, serving as an example of how binding and 
attention interact. 

Figure 14 shows an example of the binding phenomenon. In visual search 
displays, which consist of numerous distractor pac-men and one Kanizsa 
square, the binding of the Kanizsa-forming pac-men leads to an automatic 
pop-out of the Gestalt (Davis & Driver, 1994). That is, the focus of attention 
is directed towards the locus at which multiple pac-men can be bound to one 
coherent Kanizsa square. This outcome nicely demonstrates how closely 
binding and attention are related to each other. In view of this interaction, it 
becomes clear why both processes result in similar electrophysiological 
responses. Further research is required to differentiate which responses 
might be unique to binding on the one hand and attention on the other. 